Skip to content

20260514-172210 Review fixes for #621#705

Open
aram356 wants to merge 3 commits into
feature/edge-cookies-finalfrom
review/20260514-172210-621
Open

20260514-172210 Review fixes for #621#705
aram356 wants to merge 3 commits into
feature/edge-cookies-finalfrom
review/20260514-172210-621

Conversation

@aram356
Copy link
Copy Markdown
Collaborator

@aram356 aram356 commented May 17, 2026

Summary

Implements review findings for #621 so the branch is closer to a clean merge candidate. Stacked on top of #621 — merge into that branch to absorb the fixes. Rebased onto the latest feature/edge-cookies-final (aa9c96eb) so it applies cleanly.

Findings addressed

  • 🔧 P2-2 — /auction body parsed without size cap (auction/endpoints.rs). handle_auction previously did serde_json::from_slice(&req.take_body_bytes()) with no limit, so an authenticated client could buffer arbitrary bytes into the Wasm worker. Now rejects with 413 via a Content-Length precheck for well-behaved clients plus a post-read length check for clients that lie about or omit the header. Cap is 256 KiB. Check is inlined rather than calling a shared helper to avoid fighting the in-flight HTTP-types migration in http_util.rs.
  • ♻️ P2-8 — IntegrationRegistry::handle_proxy >7-arg #[allow(clippy::too_many_arguments)] (integrations/registry.rs, adapter-fastly/main.rs). CLAUDE.md disallows >7 args. Bundled method/path/settings/kv/ec_context/req under one lifetime via ProxyDispatchInput; destructured inside; allow-attr removed; the four unit-test sites + the adapter dispatch site updated.
  • P2-10/11/12/13 — leftover synthetic-ID and stale endpoint references (test_support.rs, ec/mod.rs, scenarios.rs, api-reference.md). Dropped the dead VALID_SYNTHETIC_ID test constant + "synthetic ID" doc comment, rewrote EcContext::consent_mut doc to stop citing the never-implemented /_ts/api/v1/sync endpoint, updated the EcScenario doc comment to list only endpoints that actually exist, and extended the /identify response example with cluster_size + a note on optional fields.

Findings dropped this pass

  • P2-1 (CI passphrase length) — already fixed upstream by aa9c96eb (integration-test-ec-secret-padded-32).
  • P2-9 (lift content_length_exceeds_limit to http_util)http_util has migrated to http::Request<EdgeBody> but batch_sync/auction still use fastly::Request. Sharing now would either specialize the helper to fastly::Request (against the migration direction) or force compat conversions at every call. Defer until those callers also migrate.

Findings left for the author / privacy

These three weren't auto-implemented because they need a decision (policy or scope) rather than a code change. Each is laid out below with the evidence, the impact, and the options.

🔧 P2-3 — transform_prebid_response mints /ad-proxy/ URLs that don't resolve

Whatprebid.rs:422-502 introduces transform_prebid_response, rewrite_ad_markup, and make_first_party_proxy_url. The function is called unconditionally at parse time (prebid.rs:1138) for every Prebid auction response. It walks the winning bids and rewrites the adm HTML, nurl, and burl of any bid whose markup mentions one of five hardcoded CDN hostnames (cdn.adsrvr.org, ib.adnxs.com, rtb.openx.net, as.casalemedia.com, eus.rubiconproject.com) into URLs of the form https://<request_host>/ad-proxy/....

Why it's broken in this PR — no /ad-proxy/ route exists anywhere in the codebase:

  • adapter-fastly/main.rs routing match arms don't include /ad-proxy/.
  • PrebidIntegration::routes() (prebid.rs:299-313) declares only script_patterns — no proxy endpoint.
  • git grep '/ad-proxy/' returns only the rewriter and its own unit tests.

So every winning bid from those five CDN families serves a creative whose resource loads 404 (image/script/CSS dependencies that were rewritten to /ad-proxy/...), and the nurl/burl win-notification + billing pixels also 404. Bidders that depend on nurl/burl being called will observe zero impressions and may pull back bidding entirely.

Scope concern — this is a regression vs. origin/main (where neither transform_prebid_response nor rewrite_ad_markup exists — confirmed via git show origin/main:.../prebid.rs | grep -c transform_prebid_response returning 0) and doesn't appear in the EC scope of this PR. The unit tests verify the rewriting itself but not that the rewritten URLs are reachable.

Options (need your call):

  1. Drop the rewriter from this PR. Remove transform_prebid_response, rewrite_ad_markup, make_first_party_proxy_url, their two unit tests, and the call site at line 1138. Smallest change; ships the EC PR clean.
  2. Land it properly in a separate PR. Either route /ad-proxy/{type}/{*rest} through the existing /first-party/proxy handler, or add a dedicated handler. Gate the rewriting behind a config flag so the un-routed state is impossible to ship.
  3. Gate the call site by feature flag in this PR. Add if config.first_party_proxy_enabled around the transform_prebid_response call, defaulting to off. Lets the code land without breaking bids, then a follow-up PR enables it once the route exists.

You triaged this as "don't auto-drop", so I left it in place. Strong recommendation is option 1 or 3 — option 2 is correct but doesn't have to block this PR.

❓ P2-4 — Jurisdiction::Unknown fail-closed → silent EC blackhole when geo is unwired

Whatconsent/jurisdiction.rs:48-52 returns Jurisdiction::Unknown whenever GeoInfo is None. consent/mod.rs:515 then returns false from allows_ec_creation for Unknown, with the comment "Fail-closed: block EC creation as a precaution."

Why it matters — EC depends on geo for two things, with very different criticality:

Site What it uses geo for Defensive today?
consent/jurisdiction.rs Decide GDPR / US-state / NonRegulated / Unknown to gate consent NoNoneUnknown → fail-closed
ec/kv_types.rs::KvGeo::from_geo_info Store country / region / asn / dma as KV metadata for downstream analytics YesNonecountry: "ZZ", no decisions blocked

So a misconfigured deployment where the Fastly geo lookup returns None (a new service that didn't enable geo in the dashboard, a local Viceroy run without geo fixtures, or — concerningly — the #[allow(deprecated)] GeoInfo::from_request path at ec/mod.rs:191 starting to fail) will look healthy in every other respect but issue zero EC IDs: no organic generation, /identify returns consent: denied, auction EID decoration is empty. Nothing logs, nothing alerts.

Defensive options (trade-off is compliance safety vs. operability):

Option Behavior change Compliance risk
A. Default NoneNonRegulated One-line flip, EC works without geo High — EU-facing service that loses geo silently issues EC without GDPR consent
B. [consent] default_jurisdiction = "unknown" | "non_regulated" | "gdpr", defaulting to "unknown" Operator picks fail-open vs. fail-closed per deployment None at default; operator-owned otherwise
C. Startup log::warn! when the first request's geo lookup returns None Semantics unchanged; misconfiguration becomes visible in logs None
D. Refuse to start if a bootstrap geo probe returns None Strictest None; breaks Viceroy unless geo is mocked

Recommendation: B + C for production-grade defence, or just C for a minimal, uncontroversial fix. C alone is ~10 lines and changes no behavior — I can add it without waiting for the policy answer if you'd like.

Why this is a question, not an auto-fix — the fail-closed semantics themselves are defensible (don't store identity for a user whose jurisdiction you can't determine). Flipping that without a deliberate decision changes the project's compliance posture.

❓ P2-5 — TCF storage-consent silently overrides US-Privacy opt_out_sale=Yes in US states

Whatconsent/mod.rs:485-510 checks signals for UsState in this order: GPC → TCF → GPP → US-Privacy. Once TCF is found and tcf.has_storage_consent() is true, the function returns true without ever consulting US-Privacy. The unit test ec_allowed_us_state_tcf_takes_priority_over_us_privacy (consent/mod.rs:1087-1103) explicitly encodes and asserts this precedence.

Concrete failure mode — User visits the publisher from the EU first; the CMP writes a TCF consent string into the user's browser. Some time later the same user (or someone with the same ts-ec cookie / shared device) visits from a US state with a privacy law (CA/VA/CO/CT/UT/TX/...) and uses the publisher's CCPA banner to opt out of sale (us_privacy=1YYY, opt_out_sale=Yes). The stale TCF storage consent is still present, so allows_ec_creation returns true and EC continues to operate as if the user hadn't opted out. The explicit, user-issued CCPA opt-out is silently ignored.

The code's rationale (from the inline comment at line 491-494):

When a CMP uses TCF in the US (e.g. Didomi), respect the TCF Purpose 1 decision — this is an explicit opt-in signal. The Sourcepoint GPP design documents this precedence decision.

That argument is reasonable when the TCF and US-Privacy strings come from the same CMP at the same time and represent a coordinated decision. The code does not verify that — any TCF string with storage consent wins over any US-Privacy opt-out, regardless of age or source.

Options:

  1. Keep current precedence — confirm with privacy/legal that "TCF storage consent wins over US-Privacy opt_out_sale=Yes" is the intended semantics across the stale-TCF case. Update the inline comment and test to make the cross-CMP assumption explicit.
  2. Reverse precedence to "explicit US opt-out always wins" — one-line change at consent/mod.rs:494-496: check US-Privacy first; if opt_out_sale == Yes, return false; otherwise fall through to TCF/GPP. Safest from a CCPA-enforcement standpoint.
  3. Require same-CMP correlation — only let TCF override US-Privacy when both strings were issued in the current session / by the configured primary CMP. More complex, would need a new signal source.

Why this is a question, not an auto-fix — precedence between privacy frameworks is a policy decision with legal exposure. I won't flip it without a sign-off, even though the fix is one line.

Test plan

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace (1178 core + 21 openrtb + 20 adapter-fastly + 2 doc, 0 failed)
  • cd crates/js/lib && npx vitest run (318 passed)
  • prettier --check on the modified doc
  • cargo build --package trusted-server-adapter-fastly --release --target wasm32-wasip1 with the integration-test env vars

aram356 added 3 commits May 17, 2026 10:32
Addresses review finding on #621 (P2-2): handle_auction parsed the
request body with serde_json::from_slice(&req.take_body_bytes()) and
no size limit, so an authenticated client could buffer arbitrary bytes
into the Fastly Compute worker before any check. Reject oversized
bodies with 413 — Content-Length precheck for well-behaved clients,
post-read length check for clients that lie about or omit the header.
Cap is 256 KiB — comfortable headroom for realistic Prebid-derived
auctions. The check is inlined (rather than calling the
content_length_exceeds_limit helper that batch_sync uses) because the
HTTP-types migration in flight in http_util.rs is moving away from
fastly::Request, and a shared helper would either fight that direction
or force compat conversions at every call site.
Addresses review finding on #621 (P2-8): handle_proxy took 7 arguments
including &self and was guarded by #[allow(clippy::too_many_arguments)].
CLAUDE.md disallows >7 arguments — use a struct. Introduce
ProxyDispatchInput bundling method/path/settings/kv/ec_context/req under
one lifetime, destructure inside the method, drop the allow, and update
the four unit-test call sites plus the adapter-fastly dispatch site.
Addresses review nitpicks on #621 (P2-10/11/12/13):
 - drop the dead VALID_SYNTHETIC_ID test constant + "synthetic ID" doc
   comment in test_support.rs (the EC system has no synthetic-ID concept
   any more)
 - rewrite the EcContext::consent_mut doc to stop citing the never-
   implemented /_ts/api/v1/sync endpoint
 - update the EcScenario doc comment to list only the EC endpoints that
   actually exist
 - extend the /identify response example in api-reference.md with
   cluster_size and note which fields are optional
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant