Skip to content

perf(composition): 2.2x faster federation via allocation cuts and hand-rolled AST walks#2962

Draft
jensneuse wants to merge 1 commit into
mainfrom
improve-composition
Draft

perf(composition): 2.2x faster federation via allocation cuts and hand-rolled AST walks#2962
jensneuse wants to merge 1 commit into
mainfrom
improve-composition

Conversation

@jensneuse

@jensneuse jensneuse commented Jun 11, 2026

Copy link
Copy Markdown
Member

Summary

Optimizes federateSubgraphs to be 2.22x faster (federate phase, 895.6 ms → 403.9 ms on a 150-subgraph / 1.4 MB scenario; 1.74x end to end) while keeping the public API and all composition outputs byte-identical. Every change was gated on the full test suite plus a byte-exact golden-output check of the benchmark scenario, and independently adversarially reviewed.

Phase Before After Speedup
federateSubgraphs 895.6 ms 403.9 ms 2.22x
End to end (parse + federate + print) 1169.8 ms 671.6 ms 1.74x

Measured with the included paired A/B runner (interleaved subprocess runs of both builds under identical machine conditions, median of 5 pairs, Node 24).

What changed

Deferred/lazy work

  • NormalizationResult.schema / .subgraphString, federatedGraphClientSchema, and SubgraphConfig.schema are now cached lazy getters — eliminates one buildASTSchema and one print(newAST) of every subgraph per composition (the printed string had zero consumers in the hot path)
  • Error objects for inaccessible required input values are constructed only when they actually surface (previously allocated with stack capture on the success path)

Traversal

  • Hand-rolled, order-identical AST walkers replace graphql-js visit() for the two normalization passes, renameRootTypes, and the four field-set validators (visit() was 27.5% of self time). The walkers replicate QueryDocumentKeys visitation order, return false subtree skips, and BREAK semantics exactly — verified by differential fuzzing against visit() (4,000 documents)

Caching

  • Run-scoped field-set cache storing only parse errors + normalized strings; every visiting consumer keeps its own fresh parse so raw document order is preserved
  • Memoized validateImplicitFieldSets; WeakMap-cached directive/default-value prints

Allocation cuts (~800k fewer Map/Set allocations per 150-subgraph composition)

  • Adopt-don't-copy federation merge for types defined in exactly one subgraph (97% of parents in the bench scenario skip the deep-copy path)
  • FieldData/FederatedDirectivesData containers that were empty in 98–100% of instances are now lazily allocated
  • Edge.visitedIndices Set<number> → integer watermark; copy-on-write NodeResolutionData; lazy GraphNode containers
  • Profile-guided micro-optimizations in hot upsert/copy/graph paths

Hardening (found by differential fuzzing during review)

Brace-escaped field sets such as @key(fields: "id} scalar S @d(a: 1) #") previously either crashed composition with an uncaught TypeError or silently shipped corrupted keys into the router config. safeParseFieldSet now rejects field-set documents that parse to anything other than a single operation definition with a regular composition error.

Benchmark harness (composition/bench/)

  • gen.mjs — deterministic scenario generator (150 subgraphs, ~1.4 MB SDL, composes cleanly); regenerates the scenario byte-for-byte, so the 1.8 MB fixture directory is gitignored
  • bench.mjs — per-phase timings (parse / federate / emit) with a byte-exact golden gate (golden.graphql) so a perf change can never silently change semantics
  • bench-ab.mjs — paired A/B runner comparing two builds under identical load
node composition/bench/gen.mjs
pnpm --filter @wundergraph/composition build
node composition/bench/bench.mjs

Breaking-change note (types only)

Members of the exported FieldData / FederatedDirectivesData types that are now lazily allocated became optional (?). This is a type-level breaking change for external consumers that read them without guards — no consumer in this monorepo does. Worth a semver signal in release notes.

Test plan

  • Full composition test suite passes (38 files / 1106 tests, 18 skipped)
  • Private real-customer composition suites pass (73 files / 1142 tests total incl. ebay/walmart/adobe/nerdwallet/… schemas; not committed by design)
  • Byte-exact golden supergraph SDL on the 150-subgraph benchmark scenario after every individual optimization
  • Differential fuzzing of the field-set walker vs graphql-js visit() (4,000 random documents, BREAK/false injection): zero divergence
  • Adversarial code review of every optimization round (two earlier attempts were rejected and reworked: an AST-mutating cache and a walker edge case on malformed field sets)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added benchmarking infrastructure and A/B comparison tools for performance evaluation.
  • Bug Fixes

    • Fixed error handling for undefined field properties in certain error scenarios.
  • Performance & Optimizations

    • Optimized memory allocation patterns and resource usage throughout the composition engine.
    • Implemented caching mechanisms to reduce redundant computations.
    • Enhanced traversal efficiency for schema analysis and resolution.

…d-rolled AST walks

Optimizes federateSubgraphs end to end while keeping the public API and all
composition outputs byte-identical (full test suite plus a byte-exact
150-subgraph golden check pass on every change):

- Lazy, memoized computation of NormalizationResult.schema/subgraphString,
  federatedGraphClientSchema, and SubgraphConfig.schema (eliminates one
  buildASTSchema + print of every subgraph AST per composition)
- Hand-rolled, order-identical AST walkers replacing graphql-js visit() for
  the two normalization passes, renameRootTypes, and the four field-set
  validators (visit() was 27.5% of self time; replicates QueryDocumentKeys
  order, return-false subtree skips, and BREAK semantics exactly)
- Run-scoped field-set cache (parse errors + normalized strings only; every
  visiting consumer keeps its own fresh parse in raw document order)
- Adopt-don't-copy federation merge for types defined in exactly one
  subgraph (97% of parents in the bench scenario skip deep copies)
- Allocation diet: lazily allocated FieldData/FederatedDirectivesData
  containers that are empty in 98-100% of instances, copy-on-write
  NodeResolutionData, Edge.visitedIndices Set replaced with an integer
  watermark, lazy GraphNode containers (~800k fewer Map/Set allocations
  per 150-subgraph composition)
- Deferred Error construction off the success path; memoized
  validateImplicitFieldSets; WeakMap-cached directive/default-value prints
- Profile-guided micro-optimizations in hot upsert/copy/graph paths

Hardening found by differential fuzzing during review: brace-escaped field
sets (e.g. fields: "id} scalar S @d(a: 1) #") previously crashed composition
with an uncaught TypeError or silently shipped corrupted keys into the router
config; safeParseFieldSet now rejects multi-definition field-set documents
with a regular composition error.

Adds composition/bench: a deterministic 150-subgraph/1.4MB benchmark
(gen.mjs regenerates the scenario byte-for-byte) with per-phase timings, a
byte-exact golden gate, and a paired A/B runner for load-immune comparisons.

Benchmark (Node 24, paired A/B medians): federateSubgraphs 895.6ms -> 403.9ms
(2.22x), end to end incl. parse + print 1169.8ms -> 671.6ms (1.74x).

Note: members of the exported FieldData/FederatedDirectivesData types that
are now lazily allocated became optional - a type-level breaking change for
external consumers that read them without guards.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

This PR optimizes the composition library's memory usage and performance through lazy allocation patterns, replaces graphql.visit calls with custom traversal utilities, defers expensive operations like schema construction, and introduces caching for field-set validation. It also adds comprehensive benchmarking infrastructure to measure federation throughput across builds.

Changes

Benchmarking Infrastructure

Layer / File(s) Summary
Benchmark scenario generation and configuration
composition/bench/.gitignore, composition/bench/gen.mjs
Scenario generator produces deterministic SDL with controlled federation complexity via environment-variable knobs; gitignore ignores generated scenario artifacts.
Benchmark execution and measurement
composition/bench/bench.mjs, composition/bench/bench-ab.mjs
Single-run and paired benchmark runners measure parse/federate/emit phases, compare against golden SDL, compute medians, and output metrics to stdout.

Composition System Optimization

Layer / File(s) Summary
Field-set document traversal utility
composition/src/ast/field-set-walker.ts, composition/src/ast/utils.ts
Custom FieldSetVisitor and walkFieldSetDocument traverse GraphQL field-set documents with enter/leave hooks; safeParseFieldSet wraps and validates field-set parsing.
Type system updates for lazy allocation
composition/src/schema-building/types/types.ts, composition/src/schema-building/utils.ts
FieldData and FederatedDirectivesData maps/sets become optional to support lazy allocation; federated directives are initialized as undefined instead of empty maps.
Utility and error handling hardening
composition/src/utils/utils.ts, composition/src/errors/errors.ts, composition/src/schema-building/utils.ts
add() detects insertion via size change; getFirstEntry() accepts undefined; nullable maps are iterated safely; cachedPrint caches AST node printing results.
Resolvability graph edge and node optimization
composition/src/resolvability-graph/graph-nodes.ts, composition/src/resolvability-graph/graph.ts
Edge tracking replaced with compact lastVisitedIndex: number; GraphNode collections become optional for lazy allocation; sibling-node precomputation optimizes entity-edge generation.
NodeResolutionData copy-on-write optimization
composition/src/resolvability-graph/node-resolution-data/node-resolution-data.ts
Copy-on-write semantics via #isShared flag and #materialize() method; copy() shares set references initially, clones only on first mutation.
Walker edge visit tracking
composition/src/resolvability-graph/walker/entity-walker/entity-walker.ts, composition/src/resolvability-graph/walker/root-field-walkers/root-field-walker.ts
Walkers use edge.lastVisitedIndex === this.index guard for cycle prevention; cache management refactored to use explicit Map operations and NodeResolutionData.copy().
Field-set caching and batch normalization
composition/src/v1/normalization/batch-normalization/batch-normalizer.ts, composition/src/v1/normalization/types/types.ts, composition/src/v1/normalization/types/params.ts
Introduces FieldSetCacheEntry type and shared fieldSetCacheByRawFieldSet map threading through batch and factory entrypoints to cache parsed/normalized field sets.
Normalization factory field-set traversal and lazy schema
composition/src/v1/normalization/normalization-factory.ts
Field-set documents traversed via walkFieldSetDocument instead of graphql.visit; parse results cached; schema and subgraph string construction deferred via cached variables; loops hardened against undefined maps.
Normalization walkers explicit definition iteration
composition/src/v1/normalization/walkers.ts
upsertDirectiveSchemaAndEntityDefinitions and upsertParentsAndChildren refactored from visit() to explicit definition iteration with kind-based switches; visit() retained only for nested directives in executable definitions.
Key field-set validation using field-set walker
composition/src/v1/normalization/utils.ts
validateKeyFieldSets refactored from graphql.visit() to walkFieldSetDocument callbacks; validation logic preserved; satisfiedFieldSets and externalFieldSets lazily initialized before recording.
Federation walkers explicit definition iteration
composition/src/v1/federation/walkers.ts
renameRootTypes refactored from visit() to explicit iteration by Kind; visitFieldDefinitionNode helper applies per-field logic.
Federation factory caching and adoption path
composition/src/v1/federation/federation-factory.ts, composition/src/v1/federation/utils.ts
Implicit-key-field-set validation caching; deferred inaccessibility error construction; adoption methods for single-subgraph parents; lazy client schema construction; hardened loops against missing maps.
Schema memoization and lazy initialization
composition/src/subgraph/utils.ts
InternalSubgraph.schema deferred via memoized getter instead of eager assignment.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Possibly related PRs

  • wundergraph/cosmo#2759: Modifies composition/src/resolvability-graph/graph.ts around getEntityAncestorCollection and satisfiedFieldSets shared logic, directly connected at the code-path level.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.45% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately summarizes the main performance optimization objective—a 2.2x speedup through allocation cuts and hand-rolled AST walks—which is the primary focus of the extensive changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 34.89%. Comparing base (746c876) to head (7f53f19).
⚠️ Report is 9 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2962       +/-   ##
===========================================
- Coverage   65.56%   34.89%   -30.68%     
===========================================
  Files         327      790      +463     
  Lines       46918   105569    +58651     
  Branches     5250     6373     +1123     
===========================================
+ Hits        30763    36841     +6078     
- Misses      16131    68388    +52257     
- Partials       24      340      +316     

see 463 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
composition/src/v1/normalization/normalization-factory.ts (1)

4350-4387: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Lazy schema and subgraphString now depend on external subgraphAST mutation order.

subgraphAST is returned by reference and stays mutable. Moving schema / subgraphString behind memoized getters makes them order-dependent snapshots: callers can mutate subgraphAST before first access and change what gets built, or mutate it after first access and leave the cached values stale. That is an observable API change even though the property names stayed the same.

Based on PR objectives, "Changes preserve the public API."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/src/v1/normalization/normalization-factory.ts` around lines 4350
- 4387, The lazy memoized getters for schema and subgraphString (using
lazySchema, lazySubgraphString, get schema() and get subgraphString()) depend on
the mutable returned subgraphAST (newAST) and make the API order-dependent; fix
by computing and storing the values eagerly from newAST at factory creation time
(e.g., set schema = buildASTSchema(newAST, {addInvalidExtensionOrphans: true,
assumeValid: true, assumeValidSDL: true}) and subgraphString = print(newAST))
and remove the memoized getters so the returned schema and subgraphString are
stable snapshots independent of later mutations to subgraphAST.
🧹 Nitpick comments (3)
composition/src/ast/utils.ts (1)

285-297: 💤 Low value

Use template literal instead of string concatenation.

Per coding guidelines, prefer template literals over string concatenation.

♻️ Suggested fix
 export function safeParseFieldSet(rawFieldSet: string): ParseResult {
-  const result = safeParse('{' + rawFieldSet + '}');
+  const result = safeParse(`{${rawFieldSet}}`);
   if (result.error || !result.documentNode) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/src/ast/utils.ts` around lines 285 - 297, The function
safeParseFieldSet uses string concatenation when calling safeParse ('{' +
rawFieldSet + '}'); update this to use a template literal instead (i.e.
`{${rawFieldSet}}`) so the call becomes safeParse(`{${rawFieldSet}}`); keep the
same surrounding logic and return behavior in safeParseFieldSet and do not
modify safeParse itself.

Source: Coding guidelines

composition/src/v1/normalization/walkers.ts (1)

57-117: ⚡ Quick win

Add explicit return types to the new walker helpers.

These helpers are newly introduced TypeScript functions without return annotations, which drifts from the repo rule and makes future traversal changes easier to mis-infer.

As per coding guidelines, "Use explicit type annotations for function parameters and return types in TypeScript."

Also applies to: 273-486

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/src/v1/normalization/walkers.ts` around lines 57 - 117, The new
walker helper functions (e.g., visitDirectiveNode, visitDirectiveNodes,
visitCompositeOutputDirectiveNodes, visitOperationTypeDefinitionNode and the
other helpers introduced around lines 273-486) are missing explicit return type
annotations; update each function signature to include an explicit return type
(use : void for functions that do not return a value) while preserving existing
parameter types (e.g., nf: NormalizationFactory, node: DirectiveNode, nodes:
ReadonlyArray<ConstDirectiveNode> | undefined, etc.) so TypeScript cannot infer
an unintended return type.

Source: Coding guidelines

composition/src/v1/normalization/types/types.ts (1)

41-46: ⚡ Quick win

Prefer an interface for the new cache-entry shape.

FieldSetCacheEntry is a plain object contract, so this should follow the repo’s interface convention for object shapes.

♻️ Suggested change
-export type FieldSetCacheEntry = {
+export interface FieldSetCacheEntry {
   error?: Error;
   normalizedFieldSet?: string;
-};
+}

As per coding guidelines, "Prefer interfaces over type aliases for object shapes in TypeScript."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/src/v1/normalization/types/types.ts` around lines 41 - 46,
Replace the exported type alias FieldSetCacheEntry with an exported interface of
the same name that declares the identical optional properties (error?: Error;
normalizedFieldSet?: string;), preserving the existing comment and exports so
call sites are unaffected; this converts the plain object shape to the repo's
preferred interface convention without changing property names or optionality.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@composition/bench/bench-ab.mjs`:
- Line 25: PAIRS is derived with const PAIRS = Number(pairsArg ?? 5) but isn’t
validated, allowing 0, negative, or NaN values; update the parsing/validation
where PAIRS (and the related parsing at lines handling pairsArg around the
second occurrence) is set to ensure it becomes a positive integer: convert
pairsArg to a number, check Number.isInteger and > 0, and if invalid fall back
to the default (5) or throw a clear error; adjust the code paths that use PAIRS
(the PAIRS constant and the parsing logic around the other pairsArg usage at
lines ~54-55) so they rely on the validated value.
- Around line 38-44: The current metric parser (the get arrow function used in
bench-ab.mjs to extract TOTAL_MEDIAN_MS, PARSE_MEDIAN_MS, FEDERATE_MEDIAN_MS,
EMIT_MEDIAN_MS and similarly at the other sites around lines 66–73) silently
returns NaN when the regex fails; update get to validate the regex match and
numeric parse and immediately throw a descriptive Error (including the metric
key and the raw child stdout) if no match or Number(...) is NaN so the process
fails fast when child output format changes or is partial; replace the existing
Number(out.match(...)) usage with this validated logic wherever get is used.

In `@composition/bench/bench.mjs`:
- Around line 92-99: Validate numeric env-derived loop counts before running
benchmarks: ensure BENCH_WARMUP and the iterations variable are parsed to
integers and coerced to a sensible minimum (e.g., >=1) and handle NaN; after
parsing, use these sanitized values for WARMUP and the iterations loop and guard
any use of warm (result of runOnce()) before accessing warm.success or
warm.errors so we never dereference undefined. Update the bootstrap that sets
WARMUP and the iterations (the code that currently uses
Number(process.env.BENCH_WARMUP ?? 2) and the analogous iterations parsing
around runOnce()/median calculation) to validate/coerce, log/exit on invalid
config if desired, and skip/adjust loops accordingly so medians are computed
only with positive counts.
- Around line 74-76: The golden-check currently runs even when SDL emission is
intentionally disabled; update the golden comparison to skip when emission was
skipped. Concretely, change the logic around evaluating warm.sdl / the golden
check (the block that respects process.env.BENCH_SKIP_GOLDEN) to also skip
comparison if process.env.BENCH_NO_EMIT === '1' or if the local sdl is falsy
(the sdl computed from result.federatedGraphSchema in the result.success
branch). Ensure you only perform the golden diff when an SDL was emitted (sdl
exists) and BENCH_SKIP_GOLDEN !== '1'.

In `@composition/bench/gen.mjs`:
- Around line 53-73: Validate all benchmark knob env vars (SUBGRAPHS,
SHARED_ENTITIES, OWNER_FIELDS, EXTENDER_FIELDS, REQUIRES_FIELDS, CHAIN_DEPTH,
CHAIN_WIDTH, IFACE_IMPLEMENTERS, IFACE_FIELDS, UNION_MEMBERS, VALUE_FIELDS)
immediately after parsing: ensure each is a finite integer >= 1 (or another
sensible minimum where applicable), replace NaN/0/negative values with a safe
default or throw a clear error, and use the validated values for subsequent loop
bounds and modulo ops (these vars are referenced later in generation logic such
as the loops/modulo uses around the generation of entity names and chains).
Implement this in gen.mjs right after the Number(...) assignments so all
downstream code uses validated/clamped integers and cannot produce degenerate
SDL or runtime exceptions.

In `@composition/src/schema-building/utils.ts`:
- Around line 227-229: The comparison currently uses loose equality; update the
if in the merge logic to use strict equality (===) when comparing the
cachedPrint results: replace the `if (existingDefaultValueString ==
incomingDefaultValueString)` check so it uses `===` to compare
existingDefaultValueString and incomingDefaultValueString (values produced by
cachedPrint from existingData.defaultValue and incomingData.defaultValue).

In `@composition/src/v1/federation/utils.ts`:
- Line 1: Prettier formatting is failing for files including the import line
that references BREAK, ConstDirectiveNode, Kind, and StringValueNode; run the
formatter (e.g., execute "prettier --write .") to auto-format the repository,
ensure the import statement and surrounding code in the module containing those
symbols are reformatted, then stage and commit the resulting changes so CI
passes.

In `@composition/src/v1/normalization/walkers.ts`:
- Around line 125-130: In the Kind.DIRECTIVE_DEFINITION case, also record
directive usages that appear on the directive definition's arguments so they
contribute to referencedDirectiveNames; after
nf.addDirectiveDefinitionDataByNode(...) (and/or
nf.customDirectiveDefinitionByName.set(...)) iterate definitionNode.arguments
(if any) and for each argument's directives add their .name.value into
nf.referencedDirectiveNames (or call the existing normalizer method that records
a referenced directive name) so these directive usages are tracked even though
argument nodes themselves are not otherwise visited.

---

Outside diff comments:
In `@composition/src/v1/normalization/normalization-factory.ts`:
- Around line 4350-4387: The lazy memoized getters for schema and subgraphString
(using lazySchema, lazySubgraphString, get schema() and get subgraphString())
depend on the mutable returned subgraphAST (newAST) and make the API
order-dependent; fix by computing and storing the values eagerly from newAST at
factory creation time (e.g., set schema = buildASTSchema(newAST,
{addInvalidExtensionOrphans: true, assumeValid: true, assumeValidSDL: true}) and
subgraphString = print(newAST)) and remove the memoized getters so the returned
schema and subgraphString are stable snapshots independent of later mutations to
subgraphAST.

---

Nitpick comments:
In `@composition/src/ast/utils.ts`:
- Around line 285-297: The function safeParseFieldSet uses string concatenation
when calling safeParse ('{' + rawFieldSet + '}'); update this to use a template
literal instead (i.e. `{${rawFieldSet}}`) so the call becomes
safeParse(`{${rawFieldSet}}`); keep the same surrounding logic and return
behavior in safeParseFieldSet and do not modify safeParse itself.

In `@composition/src/v1/normalization/types/types.ts`:
- Around line 41-46: Replace the exported type alias FieldSetCacheEntry with an
exported interface of the same name that declares the identical optional
properties (error?: Error; normalizedFieldSet?: string;), preserving the
existing comment and exports so call sites are unaffected; this converts the
plain object shape to the repo's preferred interface convention without changing
property names or optionality.

In `@composition/src/v1/normalization/walkers.ts`:
- Around line 57-117: The new walker helper functions (e.g., visitDirectiveNode,
visitDirectiveNodes, visitCompositeOutputDirectiveNodes,
visitOperationTypeDefinitionNode and the other helpers introduced around lines
273-486) are missing explicit return type annotations; update each function
signature to include an explicit return type (use : void for functions that do
not return a value) while preserving existing parameter types (e.g., nf:
NormalizationFactory, node: DirectiveNode, nodes:
ReadonlyArray<ConstDirectiveNode> | undefined, etc.) so TypeScript cannot infer
an unintended return type.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a46178f-ebbf-419a-9b6f-e16d417a3737

📥 Commits

Reviewing files that changed from the base of the PR and between f2f5502 and 7f53f19.

📒 Files selected for processing (26)
  • composition/bench/.gitignore
  • composition/bench/bench-ab.mjs
  • composition/bench/bench.mjs
  • composition/bench/gen.mjs
  • composition/bench/golden.graphql
  • composition/src/ast/field-set-walker.ts
  • composition/src/ast/utils.ts
  • composition/src/errors/errors.ts
  • composition/src/resolvability-graph/graph-nodes.ts
  • composition/src/resolvability-graph/graph.ts
  • composition/src/resolvability-graph/node-resolution-data/node-resolution-data.ts
  • composition/src/resolvability-graph/walker/entity-walker/entity-walker.ts
  • composition/src/resolvability-graph/walker/root-field-walkers/root-field-walker.ts
  • composition/src/schema-building/types/types.ts
  • composition/src/schema-building/utils.ts
  • composition/src/subgraph/utils.ts
  • composition/src/utils/utils.ts
  • composition/src/v1/federation/federation-factory.ts
  • composition/src/v1/federation/utils.ts
  • composition/src/v1/federation/walkers.ts
  • composition/src/v1/normalization/batch-normalization/batch-normalizer.ts
  • composition/src/v1/normalization/normalization-factory.ts
  • composition/src/v1/normalization/types/params.ts
  • composition/src/v1/normalization/types/types.ts
  • composition/src/v1/normalization/utils.ts
  • composition/src/v1/normalization/walkers.ts

}
const distA = resolve(distAArg);
const distB = resolve(distBArg);
const PAIRS = Number(pairsArg ?? 5);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate pairs as a positive integer.

Line 25 currently accepts 0, negatives, and NaN, which makes the run summary non-actionable.

Suggested fix
 const PAIRS = Number(pairsArg ?? 5);
+if (!Number.isInteger(PAIRS) || PAIRS <= 0) {
+  console.error('pairs must be a positive integer');
+  process.exit(1);
+}

Also applies to: 54-55

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/bench/bench-ab.mjs` at line 25, PAIRS is derived with const PAIRS
= Number(pairsArg ?? 5) but isn’t validated, allowing 0, negative, or NaN
values; update the parsing/validation where PAIRS (and the related parsing at
lines handling pairsArg around the second occurrence) is set to ensure it
becomes a positive integer: convert pairsArg to a number, check Number.isInteger
and > 0, and if invalid fall back to the default (5) or throw a clear error;
adjust the code paths that use PAIRS (the PAIRS constant and the parsing logic
around the other pairsArg usage at lines ~54-55) so they rely on the validated
value.

Comment on lines +38 to +44
const get = (k) => Number(out.match(new RegExp(`${k}=([\\d.]+)`))?.[1]);
return {
total: get('TOTAL_MEDIAN_MS'),
parse: get('PARSE_MEDIAN_MS'),
federate: get('FEDERATE_MEDIAN_MS'),
emit: get('EMIT_MEDIAN_MS'),
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast if a benchmark metric cannot be parsed from child output.

Line 38 can return NaN when stdout format changes or a child run is partial; downstream ratios/medians then silently degrade to NaN and pollute A/B results.

Suggested fix
-const get = (k) => Number(out.match(new RegExp(`${k}=([\\d.]+)`))?.[1]);
+const get = (k) => {
+  const value = Number(out.match(new RegExp(`${k}=([\\d.]+)`))?.[1]);
+  if (!Number.isFinite(value)) {
+    throw new Error(`Missing/invalid metric "${k}" in bench output`);
+  }
+  return value;
+};

Also applies to: 66-67, 70-73

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/bench/bench-ab.mjs` around lines 38 - 44, The current metric
parser (the get arrow function used in bench-ab.mjs to extract TOTAL_MEDIAN_MS,
PARSE_MEDIAN_MS, FEDERATE_MEDIAN_MS, EMIT_MEDIAN_MS and similarly at the other
sites around lines 66–73) silently returns NaN when the regex fails; update get
to validate the regex match and numeric parse and immediately throw a
descriptive Error (including the metric key and the raw child stdout) if no
match or Number(...) is NaN so the process fails fast when child output format
changes or is partial; replace the existing Number(out.match(...)) usage with
this validated logic wherever get is used.

Comment on lines +74 to +76
if (result.success && process.env.BENCH_NO_EMIT !== '1') {
sdl = schemaToSortedNormalizedString(result.federatedGraphSchema);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Golden check currently conflicts with BENCH_NO_EMIT=1.

Line 74–76 can intentionally skip SDL emission, but Line 107–116 still compares warm.sdl against golden unless BENCH_SKIP_GOLDEN=1. That makes runs fail by configuration, not by semantic mismatch.

Suggested fix
- if (process.env.BENCH_SKIP_GOLDEN !== '1') {
+ const shouldEmit = process.env.BENCH_NO_EMIT !== '1';
+ if (process.env.BENCH_SKIP_GOLDEN !== '1' && shouldEmit) {
    const golden = normalizeString(readFileSync(GOLDEN_PATH, 'utf8'));
    if (warm.sdl !== golden) {
      ...
    }
    console.error('Golden output check: PASS');
  }

Also applies to: 107-116

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/bench/bench.mjs` around lines 74 - 76, The golden-check currently
runs even when SDL emission is intentionally disabled; update the golden
comparison to skip when emission was skipped. Concretely, change the logic
around evaluating warm.sdl / the golden check (the block that respects
process.env.BENCH_SKIP_GOLDEN) to also skip comparison if
process.env.BENCH_NO_EMIT === '1' or if the local sdl is falsy (the sdl computed
from result.federatedGraphSchema in the result.success branch). Ensure you only
perform the golden diff when an SDL was emitted (sdl exists) and
BENCH_SKIP_GOLDEN !== '1'.

Comment on lines +92 to +99
const WARMUP = Number(process.env.BENCH_WARMUP ?? 2);
let warm;
for (let i = 0; i < WARMUP; i++) warm = runOnce();
if (!warm.success) {
const errors = warm.errors;
console.error(`Composition FAILED with ${errors.length} errors. First:\n${errors[0]?.message}`);
process.exit(1);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate warmup/iteration counts before executing benchmark loops.

Line 92 and Line 119 accept unvalidated numeric env values. BENCH_WARMUP=0 leaves warm undefined (Line 95 crash), and non-positive/NaN iterations produce invalid medians.

Suggested fix
+const readNonNegativeInt = (name, fallback) => {
+  const raw = process.env[name];
+  const value = raw === undefined ? fallback : Number(raw);
+  if (!Number.isInteger(value) || value < 0) {
+    throw new Error(`${name} must be a non-negative integer (received: ${raw ?? fallback})`);
+  }
+  return value;
+};
-const WARMUP = Number(process.env.BENCH_WARMUP ?? 2);
+const WARMUP = readNonNegativeInt('BENCH_WARMUP', 2);
...
-const ITERATIONS = Number(process.env.BENCH_ITERATIONS ?? 7);
+const ITERATIONS = readNonNegativeInt('BENCH_ITERATIONS', 7);
+if (WARMUP === 0) {
+  throw new Error('BENCH_WARMUP must be >= 1');
+}
+if (ITERATIONS === 0) {
+  throw new Error('BENCH_ITERATIONS must be >= 1');
+}

Also applies to: 119-138

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/bench/bench.mjs` around lines 92 - 99, Validate numeric
env-derived loop counts before running benchmarks: ensure BENCH_WARMUP and the
iterations variable are parsed to integers and coerced to a sensible minimum
(e.g., >=1) and handle NaN; after parsing, use these sanitized values for WARMUP
and the iterations loop and guard any use of warm (result of runOnce()) before
accessing warm.success or warm.errors so we never dereference undefined. Update
the bootstrap that sets WARMUP and the iterations (the code that currently uses
Number(process.env.BENCH_WARMUP ?? 2) and the analogous iterations parsing
around runOnce()/median calculation) to validate/coerce, log/exit on invalid
config if desired, and skip/adjust loops accordingly so medians are computed
only with positive counts.

Comment thread composition/bench/gen.mjs
Comment on lines +53 to +73
const SUBGRAPHS = Number(process.env.BENCH_SUBGRAPHS ?? 150);
// Shared entities, referenced (consistently) across subgraphs.
const SHARED_ENTITIES = Number(process.env.BENCH_SHARED_ENTITIES ?? 150);
// Base fields the OWNER subgraph declares on each entity.
const OWNER_FIELDS = Number(process.env.BENCH_OWNER_FIELDS ?? 28);
// Extra fields the EXTENDER subgraph declares on each entity (disjoint set).
const EXTENDER_FIELDS = Number(process.env.BENCH_EXTENDER_FIELDS ?? 10);
// How many @external owner fields the extender pulls in for @requires.
const REQUIRES_FIELDS = Number(process.env.BENCH_REQUIRES_FIELDS ?? 3);
// Depth of the per-subgraph nested value-type chain.
const CHAIN_DEPTH = Number(process.env.BENCH_CHAIN_DEPTH ?? 16);
// Width (fields) of each nested object in the chain.
const CHAIN_WIDTH = Number(process.env.BENCH_CHAIN_WIDTH ?? 8);
// Objects implementing each interface, per subgraph.
const IFACE_IMPLEMENTERS = Number(process.env.BENCH_IFACE_IMPLEMENTERS ?? 8);
// Interface field count.
const IFACE_FIELDS = Number(process.env.BENCH_IFACE_FIELDS ?? 8);
// Members in each union, per subgraph.
const UNION_MEMBERS = Number(process.env.BENCH_UNION_MEMBERS ?? 8);
// Fields on the wide shared (@shareable) value type.
const VALUE_FIELDS = Number(process.env.BENCH_VALUE_FIELDS ?? 18);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate benchmark knob env vars before generation.

Line 53–73 values are used as loop bounds and modulo divisors (Line 82, Line 85, Line 191) without validation. Invalid values (0, negative, NaN) can generate broken/degenerate SDL (e.g., EntityNaN) and make benchmark runs misleading.

Suggested fix
+const readPositiveInt = (name, fallback) => {
+  const raw = process.env[name];
+  const value = raw === undefined ? fallback : Number(raw);
+  if (!Number.isInteger(value) || value <= 0) {
+    throw new Error(`${name} must be a positive integer (received: ${raw ?? fallback})`);
+  }
+  return value;
+};
-const SUBGRAPHS = Number(process.env.BENCH_SUBGRAPHS ?? 150);
+const SUBGRAPHS = readPositiveInt('BENCH_SUBGRAPHS', 150);
-const SHARED_ENTITIES = Number(process.env.BENCH_SHARED_ENTITIES ?? 150);
+const SHARED_ENTITIES = readPositiveInt('BENCH_SHARED_ENTITIES', 150);
...

Also applies to: 82-85, 191-193

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/bench/gen.mjs` around lines 53 - 73, Validate all benchmark knob
env vars (SUBGRAPHS, SHARED_ENTITIES, OWNER_FIELDS, EXTENDER_FIELDS,
REQUIRES_FIELDS, CHAIN_DEPTH, CHAIN_WIDTH, IFACE_IMPLEMENTERS, IFACE_FIELDS,
UNION_MEMBERS, VALUE_FIELDS) immediately after parsing: ensure each is a finite
integer >= 1 (or another sensible minimum where applicable), replace
NaN/0/negative values with a safe default or throw a clear error, and use the
validated values for subsequent loop bounds and modulo ops (these vars are
referenced later in generation logic such as the loops/modulo uses around the
generation of entity names and chains). Implement this in gen.mjs right after
the Number(...) assignments so all downstream code uses validated/clamped
integers and cannot produce degenerate SDL or runtime exceptions.

Comment on lines +227 to 229
const existingDefaultValueString = cachedPrint(existingData.defaultValue);
const incomingDefaultValueString = cachedPrint(incomingData.defaultValue);
if (existingDefaultValueString == incomingDefaultValueString) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use strict equality for string comparison.

The coding guidelines require using === instead of ==.

-  if (existingDefaultValueString == incomingDefaultValueString) {
+  if (existingDefaultValueString === incomingDefaultValueString) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const existingDefaultValueString = cachedPrint(existingData.defaultValue);
const incomingDefaultValueString = cachedPrint(incomingData.defaultValue);
if (existingDefaultValueString == incomingDefaultValueString) {
const existingDefaultValueString = cachedPrint(existingData.defaultValue);
const incomingDefaultValueString = cachedPrint(incomingData.defaultValue);
if (existingDefaultValueString === incomingDefaultValueString) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/src/schema-building/utils.ts` around lines 227 - 229, The
comparison currently uses loose equality; update the if in the merge logic to
use strict equality (===) when comparing the cachedPrint results: replace the
`if (existingDefaultValueString == incomingDefaultValueString)` check so it uses
`===` to compare existingDefaultValueString and incomingDefaultValueString
(values produced by cachedPrint from existingData.defaultValue and
incomingData.defaultValue).

Source: Coding guidelines

@@ -1,4 +1,5 @@
import { BREAK, type ConstDirectiveNode, Kind, type StringValueNode, visit } from 'graphql';
import { BREAK, type ConstDirectiveNode, Kind, type StringValueNode } from 'graphql';

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix Prettier formatting issues before merge.

The CI pipeline reports that prettier --check . failed for all three files in this cohort. Run prettier --write . to auto-format the code and resolve the style issues.

🧰 Tools
🪛 GitHub Actions: Composition CI / 0_build_test.txt

[error] 1-1: pnpm run --filter composition lint failed because Prettier --check . reported style issues. Command: "prettier --check .". Exit status 1.

🪛 GitHub Actions: Composition CI / build_test

[error] 1-1: prettier --check . failed (exit status 1). Code style issues found in 5 files. Run Prettier with --write to fix.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/src/v1/federation/utils.ts` at line 1, Prettier formatting is
failing for files including the import line that references BREAK,
ConstDirectiveNode, Kind, and StringValueNode; run the formatter (e.g., execute
"prettier --write .") to auto-format the repository, ensure the import statement
and surrounding code in the module containing those symbols are reformatted,
then stage and commit the resulting changes so CI passes.

Source: Pipeline failures

Comment on lines +125 to +130
case Kind.DIRECTIVE_DEFINITION: {
if (nf.addDirectiveDefinitionDataByNode(definitionNode)) {
nf.customDirectiveDefinitionByName.set(definitionNode.name.value, definitionNode);
}
// Directives on the definition's arguments are intentionally not visited.
break;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# List relevant files
ls -la composition/src/v1/normalization/ || true
ls -la composition/src/v1/normalization/walkers.ts || true
ls -la composition/src/v1/normalization/normalization-factory.ts || true

# Show the directive-definition branch area in walkers.ts
sed -n '1,220p' composition/src/v1/normalization/walkers.ts | nl -ba | sed -n '90,180p'

# Show the helper ranges mentioned (57-117 and 273-486) with line numbers
sed -n '1,180p' composition/src/v1/normalization/walkers.ts | nl -ba | sed -n '40,140p'
sed -n '1,260p' composition/src/v1/normalization/walkers.ts | nl -ba | sed -n '240,360p'
sed -n '260,560p' composition/src/v1/normalization/walkers.ts | nl -ba | sed -n '250,520p'

# Search for visitDirectiveNodes and where argument-level directives might be handled
rg -n "visitDirectiveNodes|DIRECTIVE_DEFINITION|referencedDirectiveNames|customDirectiveDefinitionByName|addDirectiveDefinitionDataByNode" composition/src/v1/normalization/walkers.ts

# Inspect normalization-factory.ts relevant sections about custom/built-in directive definitions
rg -n "referencedDirectiveNames|customDirectiveDefinitionByName|DirectiveDefinition|re-emit|emit.*directive" composition/src/v1/normalization/normalization-factory.ts
sed -n '1,260p' composition/src/v1/normalization/normalization-factory.ts | nl -ba | sed -n '1,220p'
sed -n '220,520p' composition/src/v1/normalization/normalization-factory.ts | nl -ba | sed -n '220,520p'

# Check TypeScript helpers for explicit return types (rough heuristic: function definitions without ": <type>" before "{")
rg -n "function\\s+\\w+\\s*\\(|=\\s*\\(.*\\)\\s*=>|\\)\\s*=>\\s*\\{" composition/src/v1/normalization/walkers.ts

Repository: wundergraph/cosmo

Length of output: 769


🏁 Script executed:

#!/bin/bash
set -euo pipefail

print_range() {
  local file="$1"
  local start="$2"
  local end="$3"
  awk -v s="$start" -v e="$end" '{if(NR>=s && NR<=e) printf "%6d\t%s\n", NR, $0}' "$file"
}

FILE="composition/src/v1/normalization/walkers.ts"
NF_FILE="composition/src/v1/normalization/normalization-factory.ts"

echo "== Sanity: show sizes =="
wc -l "$FILE" "$NF_FILE"

echo "== Snippet around DIRECTIVE_DEFINITION branch =="
print_range "$FILE" 105 150

echo "== Snippet around lines ~57-117 (as referenced in original comment) =="
print_range "$FILE" 45 130

echo "== Snippet around lines ~273-486 (as referenced in original comment) =="
print_range "$FILE" 260 340
print_range "$FILE" 340 420
print_range "$FILE" 420 500

echo "== Find relevant helper/usage locations =="
rg -n "visitDirectiveNodes|DIRECTIVE_DEFINITION|referencedDirectiveNames|customDirectiveDefinitionByName|addDirectiveDefinitionDataByNode|Kind\.DIRECTIVE_DEFINITION" "$FILE" "$NF_FILE"

echo "== Inspect normalization-factory directive definition emission logic =="
rg -n "referencedDirectiveNames|customDirectiveDefinitionByName|DirectiveDefinition|addDirectiveDefinitionDataByNode|emit.*directive|custom directive" "$NF_FILE"
print_range "$NF_FILE" 1 260
print_range "$NF_FILE" 260 520

echo "== Check helper return types: look for functions in walkers.ts without explicit return type annotations =="
# We'll approximate by matching "function name(" or "name =" patterns and showing the signature line.
rg -n --no-heading "function\\s+\\w+\\s*\\(|=>\\s*\\{" "$FILE"
# Additionally, specifically list TS function signature lines that lack ":" return type and contain "=> {"
rg -n --no-heading "=>\\s*\\{\\s*$" "$FILE" | head -n 80

Repository: wundergraph/cosmo

Length of output: 44325


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="composition/src/v1/normalization/walkers.ts"
NF_FILE="composition/src/v1/normalization/normalization-factory.ts"

echo "== walkers.ts: find where InputValueDefinition directives are handled =="
rg -n "InputValueDefinition.*directives|directives.*InputValueDefinition|visitDirectiveNodes\\(nf, .*InputValue|visitInputValueDefinitionNode|extractDirectives\\(" "$FILE"

echo "== walkers.ts: show around upsertParentsAndChildren DIRECTIVE_DEFINITION case (around line ~480-520) =="
awk 'NR>=470 && NR<=540 {print NR "\t" $0}' "$FILE"

echo "== walkers.ts: show around visitor mapping for InputValueDefinition (search for 'InputValueDefinition:' keys) =="
rg -n "InputValueDefinition\\s*:" "$FILE" || true
rg -n "visit\\(definitionNode" "$FILE"
rg -n "visit\\(.*\\,\\s*\\{" "$FILE" | head -n 20
# If there is graphql-js visit visitor block, show around that for input values.
# Try to locate the big visit() call inside upsertParentsAndChildren.
rg -n "walkers\\.ts.*visit\\(" "$FILE" || true
# fallback: locate "visit(definitionNode" occurrences
rg -n "visit\\(definitionNode" "$FILE"
# show area around first one after upsertParentsAndChildren
# use the earliest occurrence after line 488
awk 'NR>=520 && NR<=720 {print NR "\t" $0}' "$FILE"

echo "== normalization-factory.ts: show full `#addDirectiveDefinitionsToDocument` implementation =="
# locate the method region lines around 4055-4090 as seen in earlier output
awk 'NR>=4030 && NR<=4105 {print NR "\t" $0}' "$NF_FILE"

echo "== normalization-factory.ts: check where referencedDirectiveNames is populated =="
rg -n "referencedDirectiveNames" "$NF_FILE" | head -n 50

Repository: wundergraph/cosmo

Length of output: 4064


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="composition/src/v1/normalization/walkers.ts"
NF_FILE="composition/src/v1/normalization/normalization-factory.ts"

echo "== walkers.ts: print around line 220-270 (where extractDirectives at 241/249) =="
awk 'NR>=200 && NR<=270 {printf "%6d\t%s\n", NR, $0}' "$FILE"

echo "== walkers.ts: print around visitInputValueDefinitionNode + enterInputValueDefinitionNode (420-520) =="
awk 'NR>=410 && NR<=520 {printf "%6d\t%s\n", NR, $0}' "$FILE"

echo "== walkers.ts: print around where visit(definitionNode, { ... }) is called (around line 240-320) =="
awk 'NR>=230 && NR<=320 {printf "%6d\t%s\n", NR, $0}' "$FILE"

echo "== normalization-factory.ts: print around `#addDirectiveDefinitionsToDocument` (4030-4100) =="
awk 'NR>=4020 && NR<=4105 {printf "%6d\t%s\n", NR, $0}' "$NF_FILE"

echo "== normalization-factory.ts: locate method signature for `#addDirectiveDefinitionsToDocument` =="
rg -n "`#addDirectiveDefinitionsToDocument`" "$NF_FILE"

Repository: wundergraph/cosmo

Length of output: 16238


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="composition/src/v1/normalization/normalization-factory.ts"

# Find extractDirectives implementation
rg -n "extractDirectives\\(" "$FILE" | head -n 50
rg -n "extractDirectives\\s*\\[" "$FILE" || true
rg -n "extractDirectives\\s*\\:" "$FILE" || true
rg -n "extractDirectives\\(.*\\)\\s*\\{" "$FILE" | head -n 20
rg -n "extractDirectives\\([^\\)]*\\)\\s*\\{" "$FILE" || true

# Show a likely region around the method definition
# We need exact line; use rg for "extractDirectives(" without matches? We'll narrow:
LINE=$(rg -n "extractDirectives\\(" "$FILE" | awk -F: 'NR==1{print $1}' )
echo "First extractDirectives occurrence line: $LINE"
# But better: match method signature by looking for "extractDirectives(" preceded by "#"? or "extractDirectives(" with "extractDirectives(" but in class.
# We'll search for "extractDirectives(node" or "extractDirectives(definitionNode"
rg -n "extractDirectives\\s*\\(node" "$FILE" || true
rg -n "extractDirectives\\s*\\(.*directives" "$FILE" || true

# show around the best candidate: line with "extractDirectives(" within NormalizationFactory class likely.
# We'll list a few lines with "extractDirectives(" and show each with context
rg -n "extractDirectives\\(" "$FILE" | head -n 20 | while read -r l; do echo "occ: $l"; done

# Let's directly grep for "extractDirectives(" then use awk to locate closest "extractDirectives" method definition by looking for previous 5 lines with "extractDirectives" and check for braces.
rg -n "extractDirectives\\(" "$FILE" | head -n 30 | awk -F: '{print $1}' | sort -n | head -n 5

# show around line ~?? if we get candidates
for candidate in $(rg -n "extractDirectives\\(" "$FILE" | awk -F: '{print $1}' | sort -n | head -n 10); do
  echo "---- candidate $candidate ----"
  awk -v s=$candidate-5 -v e=$candidate+25 'NR>=s && NR<=e {printf "%6d\t%s\n", NR, $0}' "$FILE"
done

# Find addInputValueDataByNode implementation and see whether it extracts directives and updates referencedDirectiveNames
rg -n "addInputValueDataByNode\\s*\\(" "$FILE"
rg -n "addInputValueDataByNode\\s*\\(" "$FILE" | head -n 20

# Show around first method definition call
cand=$(rg -n "addInputValueDataByNode\\s*\\(" "$FILE" | awk -F: 'NR==1{print $1}')
echo "First addInputValueDataByNode occurrence line: $cand"
awk -v s=$cand-10 -v e=$cand+60 'NR>=s && NR<=e {printf "%6d\t%s\n", NR, $0}' "$FILE"

Repository: wundergraph/cosmo

Length of output: 768


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="composition/src/v1/normalization/normalization-factory.ts"
WALKERS="composition/src/v1/normalization/walkers.ts"

echo "== walkers.ts: signatures around 57-117 =="
awk 'NR>=50 && NR<=130 {print NR "\t" $0}' "$WALKERS"

echo "== walkers.ts: signatures around 273-420 =="
awk 'NR>=260 && NR<=430 {print NR "\t" $0}' "$WALKERS"

echo "== normalization-factory.ts: extractDirectives method region around first occurrence line 568 =="
awk 'NR>=540 && NR<=640 {print NR "\t" $0}' "$FILE"

echo "== normalization-factory.ts: addInputValueDataByNode method region =="
# find definition line
ADD_LINE=$(rg -n "addInputValueDataByNode\\s*\\(" "$FILE" | head -n 1 | awk -F: '{print $1}')
echo "addInputValueDataByNode appears at line: $ADD_LINE"
awk -v s=$((ADD_LINE-40)) -v e=$((ADD_LINE+120)) 'NR>=s && NR<=e {print NR "\t" $0}' "$FILE"

echo "== normalization-factory.ts: referencedDirectiveNames updates (add to set) =="
rg -n "referencedDirectiveNames\\.(add|delete|clear|forEach)" "$FILE" || true

echo "== normalization-factory.ts: where directives extracted from InputValueDefinitionNode are used =="
# look for "node.directives" and extractDirectives usage together
rg -n "node\\.directives|directives\\s*\\:" "$FILE" | head -n 50

echo "== normalization-factory.ts: extractDirectives call sites that might feed referencedDirectiveNames =="
rg -n "extractDirectives\\(" "$FILE" | head -n 30

Repository: wundergraph/cosmo

Length of output: 22380


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for directive argument directive emission usage
rg -n "DirectiveArgumentData|directive argument|directivesByName" composition/src/v1 -S | head -n 80

# Narrow to directive-definition-data + schema-building areas
rg -n "directivesByName" composition/src/v1/directive-definition-data composition/src/v1/schema-building 2>/dev/null || true

# Find where directive arguments are converted to AST nodes
rg -n "getMutableInputValueNode|getMutableDirective|directive.*argument|ArgumentData" composition/src/v1 -S | head -n 80

# Specifically check for printing of directive definitions
rg -n "print\\(|DocumentNode|definitions\\)" composition/src/v1/normalization/normalization-factory.ts | head -n 80

Repository: wundergraph/cosmo

Length of output: 18455


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="composition/src/v1/normalization/normalization-factory.ts"
WALKERS="composition/src/v1/normalization/walkers.ts"

echo "== referencedDirectiveNames usage in normalization-factory.ts =="
rg -n "referencedDirectiveNames" "$FILE"

echo "== referencedDirectiveNames added anywhere else besides walkers.ts =="
rg -n "referencedDirectiveNames\\.(add|set)" composition/src/v1/normalization "$FILE" composition/src/v1/normalization/walkers.ts || true

echo "== find where DirectiveArgumentData.directivesByName is used/emitted =="
rg -n "DirectiveArgumentData|directivesByName" composition/src/v1/directive-definition-data composition/src/v1/schema-building composition/src/v1/federation composition/src/v1/normalization -S | head -n 200

# More focused: directive-definition-data utilities likely build AST nodes
rg -n "directive.*argument|arguments.*directive|getMutableInputValueNode|getMutableDirective|print.*directive" composition/src/v1 -S | head -n 200

# Also check if directive definition emission uses directives on directive-argument nodes
rg -n "directive.*@|newDirectiveDefinitionData|extractDirectiveArgumentData|newDirectiveArgumentData|directivesByName" composition/src/v1 -S | head -n 200

Repository: wundergraph/cosmo

Length of output: 20479


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="composition/src/v1/normalization/normalization-factory.ts"

# Locate addInputValueDataByNode definition
ADD_INPUT_LINE=$(rg -n "addInputValueDataByNode\\s*\\(" "$FILE" | head -n 1 | awk -F: '{print $1}')
echo "addInputValueDataByNode appears at line: $ADD_INPUT_LINE"
awk -v s=$((ADD_INPUT_LINE-60)) -v e=$((ADD_INPUT_LINE+200)) 'NR>=s && NR<=e {printf "%6d\t%s\n", NR, $0}' "$FILE"

# Locate where InputValueData directivesByName might be used for printing/AST generation
# Search for "inputValue.*directives" and "getValidFlattenedDirectiveArray" areas near schema printing
rg -n "inputValue.*directivesByName|directivesByName.*inputValue|InputValueData" "$FILE" | head -n 60
rg -n "getValidFlattenedDirectiveArray\\(" "$FILE"
# Show a region around schemaNodeFromData or getValidFlattenedDirectiveArray usage
SCHEMA_FN_LINE=$(rg -n "getValidFlattenedDirectiveArray\\(" "$FILE" | head -n 1 | awk -F: '{print $1}')
echo "first getValidFlattenedDirectiveArray call at: $SCHEMA_FN_LINE"
awk -v s=$((SCHEMA_FN_LINE-80)) -v e=$((SCHEMA_FN_LINE+180)) 'NR>=s && NR<=e {printf "%6d\t%s\n", NR, $0}' "$FILE" | head -n 260

# Locate printing of directive definitions including argument nodes
rg -n "DirectiveDefinitionNode|kind: Kind\\.DIRECTIVE_DEFINITION|directive.*Definition" "$FILE" | head -n 60

# Search for "directives: .*getValidFlattenedDirectiveArray" near where directive definitions are built
rg -n "directive.*directives|directives: this\\.getValidFlattenedDirectiveArray|directives: .*getValidFlattenedDirectiveArray" "$FILE" | head -n 80

Repository: wundergraph/cosmo

Length of output: 28174


🏁 Script executed:

#!/bin/bash
set -euo pipefail

BASE="composition/src/v1"
echo "== Locate directive-definition-data directory =="
find "$BASE" -maxdepth 4 -type d -name "*directive*definition*" -print

echo "== List potential directive-definition-data files =="
find "$BASE" -maxdepth 6 -type f | rg -i "directive-definition-data" | head -n 50

echo "== Search for directive argument AST generation =="
rg -n "DirectiveArgument|argumentDataByName|getMutable.*Directive|getValidFlattenedDirectiveArray\\(|getDirective.*Argument|directive.*arguments" "$BASE" | head -n 80

echo "== Search for printing of directive definitions (SDL AST nodes) =="
rg -n "kind: Kind\\.DIRECTIVE_DEFINITION|Kind\\.DIRECTIVE_DEFINITION|directiveDefinition.*node|newDirectiveDefinitionData|get.*DirectiveDefinition" "$BASE" | head -n 80

echo "== Search for use of directivesByName in directive argument node building =="
rg -n "argument.*directivesByName|directivesByName: .*argument|getValidFlattenedDirectiveArray\\(.*directivesByName.*argument" "$BASE" | head -n 120

Repository: wundergraph/cosmo

Length of output: 163


🏁 Script executed:

#!/bin/bash
set -euo pipefail
BASE="."

echo "== Find directive-definition-data directory/files =="
fd -t f -i "directive-definition-data" . | head -n 50
fd -t d -i "directive-definition-data" . | head -n 50

echo "== Search for newDirectiveDefinitionData / newDirectiveArgumentData =="
rg -n "newDirectiveDefinitionData|newDirectiveArgumentData" . | head -n 50

echo "== Search for argNode.directives usage (directives on directive-definition arguments) =="
rg -n "argNode\\.directives|argument.*directives.*directive.*definition|directives.*directive.*argument" . | head -n 50

echo "== Print implementations around addDirectiveDefinition argument cost + directive extraction =="
# likely in normalization-factory.ts already; but also in directive-definition-data/utils or types
rg -n "extract.*Directives\\(|directivesByName: .*argument|directivesByName.*argument" composition/src/v1 ../../ 2>/dev/null | head -n 80

Repository: wundergraph/cosmo

Length of output: 50375


Track directives used on directive-definition arguments in the schema-walker.

Kind.DIRECTIVE_DEFINITION currently does not visit definitionNode.arguments directives, so those directive usages can be carried into the normalized output while the corresponding built-in directive definitions are not emitted (since emitted built-ins come from referencedDirectiveNames and directive usages on directive-definition arguments never add to it).

🐛 Minimal fix
       case Kind.DIRECTIVE_DEFINITION: {
         if (nf.addDirectiveDefinitionDataByNode(definitionNode)) {
           nf.customDirectiveDefinitionByName.set(definitionNode.name.value, definitionNode);
         }
-        // Directives on the definition's arguments are intentionally not visited.
+        if (definitionNode.arguments) {
+          for (const inputValueNode of definitionNode.arguments) {
+            visitDirectiveNodes(nf, inputValueNode.directives);
+          }
+        }
         break;
       }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case Kind.DIRECTIVE_DEFINITION: {
if (nf.addDirectiveDefinitionDataByNode(definitionNode)) {
nf.customDirectiveDefinitionByName.set(definitionNode.name.value, definitionNode);
}
// Directives on the definition's arguments are intentionally not visited.
break;
case Kind.DIRECTIVE_DEFINITION: {
if (nf.addDirectiveDefinitionDataByNode(definitionNode)) {
nf.customDirectiveDefinitionByName.set(definitionNode.name.value, definitionNode);
}
if (definitionNode.arguments) {
for (const inputValueNode of definitionNode.arguments) {
visitDirectiveNodes(nf, inputValueNode.directives);
}
}
break;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@composition/src/v1/normalization/walkers.ts` around lines 125 - 130, In the
Kind.DIRECTIVE_DEFINITION case, also record directive usages that appear on the
directive definition's arguments so they contribute to referencedDirectiveNames;
after nf.addDirectiveDefinitionDataByNode(...) (and/or
nf.customDirectiveDefinitionByName.set(...)) iterate definitionNode.arguments
(if any) and for each argument's directives add their .name.value into
nf.referencedDirectiveNames (or call the existing normalizer method that records
a referenced directive name) so these directive usages are tracked even though
argument nodes themselves are not otherwise visited.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant