Skip to content

20260518-053619 Review fixes for #680#708

Open
aram356 wants to merge 9 commits into
server-side-ad-templates-implfrom
review/20260518-053619-680
Open

20260518-053619 Review fixes for #680#708
aram356 wants to merge 9 commits into
server-side-ad-templates-implfrom
review/20260518-053619-680

Conversation

@aram356
Copy link
Copy Markdown
Collaborator

@aram356 aram356 commented May 18, 2026

Summary

Implements the actionable findings from the pass-1 review of #680 ("Implement server-side ad slot templates with PBS and APS auction") so the branch is closer to a clean merge candidate. Stacked on top of #680 — merge into that branch to absorb the fixes. Based on the current head e45f458f.

Findings addressed

🔧 Blocking

  • P2-1 — AuctionContext request contract pinned. collect_dispatched_auction and the publisher-side collectors quietly built a fresh fastly::Request::get("https://placeholder.invalid/") per request and stuffed it into AuctionContext.request before invoking the mediator. Works today only because the current mediator doesn't read headers — a future PBS-as-mediator change would silently lose DNT, client IP, UA. Added a thorough doc-comment distinguishing dispatch (real request) from collect (synthetic) on AuctionContext::request, centralised the URL as MEDIATOR_PLACEHOLDER_URL, and added a debug_assert! in make_collect_context so callers that accidentally route a real request through the collect path fail loudly in debug builds. Full snapshot-headers refactor tracked as follow-up.
  • P2-2 — __tsAdInit no longer double-enables GPT services. The inline bootstrap injected at <head> called googletag.pubads().enableSingleRequest() and googletag.enableServices() unconditionally; the publisher's own GPT init code (or the TS bundle's later-installed version) would re-enable, producing duplicate ad requests. Now both inline and bundle converge on __tsServicesEnabled and only call refresh(newSlots) — never the unbounded refresh().
  • P2-3 / P2-16 — HTML stream processor goes through from_settings. create_html_stream_processor built HtmlProcessorConfig inline, bypassing HtmlProcessorConfig::from_settings — any future field added there would silently miss the auction-hold streaming path. Added an HtmlProcessorConfig::with_ad_state(...) builder and routed through from_settings so the canonical builder stays the single source of truth. The previously-unused _settings argument is now actually used.
  • P2-4 — /__ts/page-bids bot / prefetch gate. Same is_bot and is_prefetch short-circuits the publisher path applies. Slots are still returned (HTML structure unchanged) but the auction is skipped, so a crawler or Sec-Purpose=prefetch warm-up can't burn partner SSP request quota. New tests cover both gates.
  • P2-5 / P2-13 — adapter parses creative-opportunities.toml once and falls back instead of panicking. The previous per-request toml::from_str(...).expect("should parse...") would have panicked every request on a malformed embedded TOML (binary patch, future schema change). Now a LazyLock parses once per Wasm instance; on failure it logs the error and falls back to the documented "empty slots file = feature disabled" state.
  • P2-6 — APS provider .take() hardening. ApsAuctionProvider::slot_id_map wasn't cleared after parse_response, leaving stale state attached to the long-lived Arc<dyn AuctionProvider> if Fastly Compute ever reused a Wasm instance across requests. std::mem::take clears the lock now (mirroring what the mock provider already did with bid_index).
  • P2-7 — winning_bids.len() no longer overcounts. apply_floor_prices used to pass price=None bids through; the no-mediator caller already filtered them via select_winning_bids, so the pass-through was dead code that would, if revived, cause winning_bids.len() to overcount what build_bid_map actually ships. Dropped the None branch; the existing test was updated to pin the new contract (callers must decode prices first).

♻️ / 🌱 / ⛏ / 🏕 Non-blocking

  • P2-14 — MatchedSlotsContext. Bundled (matched_slots, request_path, co_config) into a struct so build_auction_request drops below the project's 7-argument cap and leaves headroom for future inputs.
  • P2-15 — Drop dead PrebidIntegrationConfig::suppress_nurl. Field was declared, defaulted, and never read.
  • P2-17 — Extract prepend_auction_debug_comment. Pulled the duplicated debug-comment-prepend logic out of one_behind_loop and the BufferedProcessed arm into a single helper.
  • P2-18 — price_bucket rejects non-finite cpm. (x * 100.0).floor() as u64 had implementation-defined behaviour for NaN/Inf. Now returns "0.00" early; new test pins the contract for every granularity.
  • P2-24 — Lowered noisy log::info! to log::debug! for write_bids_to_state (per-request slot list).
  • P2-25 — Replaced infallible unwrap_or_else on serde_json::to_string(Map/Vec) with expect("…should be infallible") so a future bug doesn't get silently swallowed.
  • P2-26 — Extracted __tsAdInit bootstrap to gpt_bootstrap.js pulled in via include_str!, so edits diff cleanly and get JS syntax highlighting in editors.

Findings deferred

  • P2-8 — Brotli Decompressor::new(body, 4096) buffer-size question. Pure perf / not correctness. Comment-only.
  • P2-9 — RwLock vs Mutex for ad_bids_state. Comment-only.
  • P2-10 — html_escape_for_script 7 sequential String::replace allocations. Pure perf observation.
  • P2-11 — is_bot substring scan is best-effort. Comment-only.
  • P2-12 — DispatchedAuction clones AuctionRequest even without mediator. Minor allocation; pure perf observation.

Findings left for the author

The pass-1 review on #680 also flagged a few items that are non-mechanical:

  • Backward-compat break: PBS bidders.insert(b, json!({})) empty-params fallback was replaced with ImpStoredRequest. Any external caller of /auction who was relying on the old "send empty params to every configured bidder" behaviour is silently broken. Verify intended.
  • Implicit Wasm-isolation assumption is now load-bearing in two places (mediator placeholder, provider Mutex). A central note in CLAUDE.md describing the concurrency model would give future changes a contract to break against.
  • Config bifurcation between trusted-server.toml and creative-opportunities.toml (both include_str!'d, no hot-reload). Worth a follow-up issue.
  • No integration test for one_behind_loop. Chunk-holding logic is only exercised via the higher-level rewriter tests.

Test plan

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace (1008 core + 21 openrtb + 20 adapter-fastly + 3 doc, 0 failed)
  • cd crates/js/lib && npx vitest run (297 passed)
  • Manual reading of the changes against the original review.

aram356 added 6 commits May 17, 2026 22:38
Addresses review findings on #680:
 - P2-18 / price_bucket: reject NaN/Inf cpm up front before the
   (x * 100.0).floor() as u64 cast (Rust's NaN-cast behaviour is
   only safe by convention, not contract). Add test coverage.
 - P2-24 / publisher.rs::write_bids_to_state: drop the per-request
   log line from INFO to DEBUG so production logs don't spam a slot
   list on every page request.
 - P2-25 / publisher.rs::build_bids_script / build_ad_slots_script:
   serde_json::to_string of a Map / Vec is infallible -- use
   expect("should be infallible") instead of unwrap_or_else with a
   silent fallback that would mask any future bug.
 - P2-15 / prebid: drop the dead PrebidIntegrationConfig::suppress_nurl
   field -- declared but never read anywhere in the codebase. The
   no-op test it carried goes with it.
Addresses review findings on #680:
 - P2-2: the inline `__tsAdInit` bootstrap injected at <head> called
   googletag.pubads().enableSingleRequest() and googletag.enableServices()
   unconditionally. The TS bundle's later-installed version guards both
   with a `__tsServicesEnabled` flag — the inline version did not, so
   the publisher's own GPT init code (or an upgrade where the bundle
   loads before the bids script runs) caused double-enable + duplicate
   refresh(), producing duplicate ad requests on every load. Now both
   inline and bundle converge on the same flag and only invoke
   `refresh(newSlots)` for the slots this pass actually defined, never
   the global slot list.
 - P2-26: the bootstrap source moves out of a concat!() literal block
   in head_inserts() into a syntax-highlighted gpt_bootstrap.js file
   pulled in via include_str!. The Rust side keeps a single named
   constant, GPT_BOOTSTRAP_JS, so future edits diff cleanly.

Adds a regression test that asserts the guard flag is present and that
unbounded refresh() is gone.
Addresses review findings on #680:
 - P2-6: APS provider held its per-request slot_id_map across
   request boundaries when the same Wasm instance was reused
   (the mock provider already cleared its bid_index, APS did not).
   parse_response now `std::mem::take`s the map so it can never
   carry over to a subsequent request.
 - P2-7: apply_floor_prices used to silently pass bids with
   `price=None` through the floor filter. Today both production
   callers decode/skip None before calling, so the pass-through
   was dead code that would, if revived, cause winning_bids.len()
   to overcount what build_bid_map ships to the client. Drop the
   None branch and update the existing test to pin the new
   contract: callers must decode prices first.
Addresses review findings on #680:
 - P2-4: /__ts/page-bids ran the full SSP auction for every request
   without gating crawlers or prefetches the way the publisher path
   does, exposing partner request quota to client-side spraying.
   Apply the same is_bot / is_prefetch gate handle_publisher_request
   uses: slots are still returned (so HTML structure is unchanged)
   but the auction is short-circuited. New regression tests cover
   both gates.
 - P2-5 + P2-13: the adapter previously parsed CREATIVE_OPPORTUNITIES_TOML
   on every request and `expect("should parse...")` on failure, so a
   malformed embedded TOML (CI-bypassed binary patch, future schema
   change, anything build.rs didn't catch) would panic every request.
   Parse it lazily once per Wasm instance via LazyLock; on parse
   failure log an error and fall back to the documented "empty slots
   file = feature disabled" state instead of panicking.
Addresses review findings on #680:
 - P2-3 + P2-16: create_html_stream_processor previously built
   HtmlProcessorConfig inline, bypassing HtmlProcessorConfig::from_settings.
   A future edit to from_settings (e.g. to read a new flag from Settings)
   would silently miss the streaming-with-auction-hold path. Now goes
   through from_settings, with a new with_ad_state(...) builder method
   that layers the ad_slots_script / ad_bids_state fields on top. The
   `_settings` argument on create_html_stream_processor is now actually
   used and is no longer underscore-prefixed.
 - P2-17: the auction debug-comment prepend logic was duplicated between
   one_behind_loop (path=stream) and the BufferedProcessed arm
   (path=buffered). Extracted as `prepend_auction_debug_comment(label,
   result, state)` so the single source of truth gets one definition
   and the only difference between paths is the label string.
 - P2-14: build_auction_request was at the project's 7-argument cap.
   Bundled (matched_slots, request_path, co_config) into a new
   MatchedSlotsContext struct — the three fields always travel together
   anyway. The function now takes 5 args, leaving headroom for future
   per-request inputs without breaking the project rule.
Addresses review finding P2-1 on #680: collect_dispatched_auction and the
publisher-side collectors built fastly::Request::get("https://placeholder.invalid/")
on the fly and stuffed it into AuctionContext.request before invoking the
mediator. That worked today only because the current mediator
(adserver_mock) does not read request headers. A future PBS-as-mediator
change would silently lose DNT, client IP, UA, etc., because the
placeholder has none of them.

The structural fix that surfaces the contract:
 - AuctionContext::request now carries a thorough doc-comment that
   explicitly distinguishes the dispatch path (real client request, all
   headers available) from the collect path (synthetic placeholder, no
   client headers — mediators MUST snapshot at dispatch time if they
   need them).
 - MEDIATOR_PLACEHOLDER_URL is a single named const so the three inline
   `"https://placeholder.invalid/"` literals across orchestrator.rs and
   publisher.rs converge on one source of truth. Tests / debug-asserts
   can compare against it.
 - make_collect_context now debug_asserts that the placeholder argument
   matches the canonical URL, so any future caller that accidentally
   forwards a real client request through the collect path fails loudly
   in debug builds instead of triggering an invisible header-loss bug.

A full snapshot-headers refactor (so mediators can read real client
headers across the await) is tracked as follow-up; this PR makes the
existing contract impossible to violate accidentally.
aram356 added 3 commits May 18, 2026 01:04
Addresses pass-2 review finding P3-1 on #680: handle_auction read the
request's cookie jar for consent purposes but never threaded it into
the AuctionRequest, so the Extended User IDs from the `ts-eids` cookie
were dropped on the floor. The publisher-page and /__ts/page-bids paths
both call parse_ts_eids_cookie(cookie_jar.as_ref()); the /auction
endpoint now does the same so programmatic callers (slim-Prebid, native
apps, server-to-server integrations) get parity instead of silently
losing identity data.
Addresses pass-2 review findings on #680:

 - Dedupe the bot-UA and prefetch checks. handle_publisher_request and
   handle_page_bids both inlined the same 5-entry crawler list and the
   sec-purpose/purpose header check. Promoted to two pub(crate) helpers
   in publisher.rs (is_bot_user_agent, is_prefetch_request) backed by a
   single BOT_USER_AGENT_FRAGMENTS const, so the two paths can't drift.

 - Drop the dead gam_network_id argument on
   CreativeOpportunitySlot::to_ad_slot. The parameter was already being
   discarded via `let _ = gam_network_id;`. Removing it also drops the
   now-unused co_config field from MatchedSlotsContext.

 - Pre-compile slot glob patterns once per Wasm instance. Each call to
   matches_path previously ran Pattern::new (per pattern, per slot, per
   request). The slots live in the adapter's LazyLock-cached
   CreativeOpportunitiesFile, so adding a #[serde(skip)] compiled_patterns
   cache populated by CreativeOpportunitiesFile::compile() at file-load
   time turns matches_path into a Vec<Pattern>::iter().any() lookup on
   the hot path. The fallback (compile-per-call) is preserved for
   hand-built slots in tests. Two new regression tests pin the cache.
@aram356 aram356 requested a review from prk-Jr May 18, 2026 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant