[Experiment] Combine CI Build and Queue Tests into a single invocation#54933

Draft
mmitche wants to merge 3 commits into dotnet:main from mmitche:mmitche/combine-build-test-steps

mmitche commented Jun 23, 2026

Member

Experiment: combine the CI Build and Queue Tests steps

⚠️ Draft / experiment. The goal is to evaluate the effect of merging the two build.{ps1,sh} invocations in eng/pipelines/templates/jobs/sdk-build.yml. Please do not merge as-is.

Background

The SDK build job runs two sequential invocations in the same ADO job:

  1. Build: -restore -build -pack
  2. Queue Tests: -restore -test, scoped to test/UnitTests.proj

The second invocation re-restores and recompiles much of the product. test/UnitTests.proj publishes every *.Tests.csproj with a RuntimeIdentifier, which propagates the RID down all product ProjectReferences into RID-qualified output folders (.../<tfm>/<rid>/). So ~55 product projects and the whole restore graph are processed twice across the two steps.
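For reference, the two-step flow can be sketched as a dry run that only prints the commands. The flags come from the list above; scoping the second step with `-projects` is an assumption made for illustration (the real step may pass the project differently), and the `run` and `ci_two_step` helpers are hypothetical.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the current two-step flow: prints the commands, runs nothing.
set -euo pipefail

# Hypothetical helper: echo a command line instead of executing it.
run() { printf '+ %s\n' "$*"; }

ci_two_step() {
  # Step 1 "Build": restore, build, and pack the product.
  run ./build.sh -restore -build -pack
  # Step 2 "Queue Tests": a second full invocation scoped to the Helix orchestrator.
  # ASSUMPTION: -projects is used here only to illustrate the scoping.
  run ./build.sh -restore -test -projects test/UnitTests.proj
}

ci_two_step
```

Both lines start a fresh build.sh process with its own restore, which is the overhead this experiment tries to measure.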

What this PR does

  • sdk.slnx now includes test/UnitTests.proj (the Helix orchestrator), so test queuing can happen as part of a single build.
  • sdk.slnf (new) is a default/build-only filter = sdk.slnx minus UnitTests.proj. Build-only legs (runTests=false) and the dotnet-format integration leg use it so they neither pull in the orchestrator nor queue Helix work.
  • eng/XUnitV3/XUnitV3.Runner.targets — the in-process RunTests target is now guarded on CustomHelixTargetQueue == ''. Building the full solution with -test therefore does not run every *.Tests.csproj on the build agent; UnitTests.proj still submits them to Helix.
  • eng/pipelines/templates/jobs/sdk-build.yml — the separate Build + Queue Tests steps are replaced with a single Build and Test step when runTests=true (full sdk.slnx, -build -pack -test + Helix props), and a Build step otherwise (sdk.slnf).
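As a rough sketch of the step selection described in the last bullet, the function below prints which step and arguments a leg would get. It is a dry run under assumptions: `select_step` is a hypothetical helper, passing the solution via `-projects` is illustrative, and the restore flag and Helix properties are left out.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the runTests-based step selection (first iteration of this PR).
set -euo pipefail

# Hypothetical helper: print the step name and build arguments for a leg.
# ASSUMPTION: -projects is shown only to make the solution choice visible.
select_step() {
  local run_tests=$1
  if [[ $run_tests == true ]]; then
    # Full solution, including the Helix orchestrator, with -test.
    printf '%s\n' "Build and Test: -build -pack -test -projects sdk.slnx"
  else
    # Build-only filter, so no Helix work is queued.
    printf '%s\n' "Build: -build -pack -projects sdk.slnf"
  fi
}

select_step true
select_step false
```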

Known caveats (why it's a draft)

  • Build ordering: UnitTests.proj has no ProjectReferences to the product, so a single solution build may schedule its RID publish relative to the product differently. The draft CI run is meant to surface this.
  • Local dev: build.cmd/build.sh with no args auto-detects sdk.slnx, which now includes UnitTests.proj and its heavy RID publish. Lean local builds should pass -projects sdk.slnf.
  • sdk.slnf maintenance: it enumerates all projects except UnitTests.proj and must be kept in sync as projects are added/removed (.slnf has no exclusion syntax).
  • The RID double-compile is not eliminated (it originates in UnitTests.proj's publish). This experiment measures the restore/bootstrap savings from not running two full invocations, not compile elimination.

🤖 Generated with assistance from GitHub Copilot.

mmitche and others added 3 commits June 22, 2026 17:29
Today the sdk-build.yml job runs two sequential build.{ps1,sh} invocations:
a "Build" step (-restore -build -pack) and a "Queue Tests" step
(-restore -test, scoped to test/UnitTests.proj). The second step re-restores
and recompiles much of the product because UnitTests.proj publishes every
*.Tests.csproj with a RuntimeIdentifier, propagating the RID through all
product ProjectReferences into RID-qualified output folders.

This experimental change merges the two into one invocation to measure the
savings:

- sdk.slnx now includes test/UnitTests.proj (the Helix orchestrator) so test
  queuing can happen as part of the same build.
- sdk.slnf is a new default/build-only filter: sdk.slnx minus UnitTests.proj.
  Build-only legs (runTests=false) and the dotnet-format integration leg use
  it so they neither pull in the orchestrator nor queue Helix work.
- eng/XUnitV3/XUnitV3.Runner.targets guards the in-process RunTests target on
  CustomHelixTargetQueue being empty, so building the full solution with -test
  does NOT run every *.Tests.csproj on the build agent; UnitTests.proj still
  submits them to Helix.
- sdk-build.yml replaces the separate Build + Queue Tests steps with a single
  combined "Build and Test" step when runTests=true (full sdk.slnx, -build
  -pack -test + Helix props) and a "Build" step otherwise (sdk.slnf).

Known caveats (this is a draft/experiment):
- Build ordering: UnitTests.proj has no ProjectReferences to the product, so a
  single solution build may schedule its publish relative to the product in a
  new way; the draft CI run is intended to surface this.
- Local `build.cmd`/`build.sh` with no args auto-detects sdk.slnx, which now
  includes UnitTests.proj and its heavy RID publish; devs wanting a lean build
  should pass `-projects sdk.slnf`.
- sdk.slnf enumerates all projects except UnitTests.proj and must be kept in
  sync as projects are added/removed.
- The RID double-compile itself is not eliminated (it originates in
  UnitTests.proj's publish); this measures the restore/bootstrap savings from
  not running two full invocations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… adding .proj to .slnx

The first attempt added test/UnitTests.proj to sdk.slnx and generated sdk.slnf to
exclude it. That broke every CI leg: the .slnx solution format (SolutionPersistence)
has no registered project type for the .proj extension, so loading sdk.slnx failed
with "ProjectType '' not found" (MSB4025) for all consumers of the solution/filter.

Instead, keep sdk.slnx unchanged and have the combined "Build and Test" step build
both the product solution and the Helix orchestrator in one invocation via
  /p:Projects=sdk.slnx;test/UnitTests.proj
MSBuild builds listed projects in order, so the product is built before
UnitTests.proj publishes/queues the tests. Build-only legs build just sdk.slnx,
exactly as the former 'Build' step did.

This removes sdk.slnf entirely (and its sync fragility) and reverts the sdk.slnx and
dotnet-format-integration.yml edits. The RunTests in-process guard in
eng/XUnitV3/XUnitV3.Runner.targets is retained so the per-project *.Tests.csproj do
not run on the build agent when queuing to Helix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

MSBuild's /p: switch uses ';' to separate multiple property assignments, so
passing /p:Projects=sdk.slnx;test/UnitTests.proj unquoted caused MSBuild to
parse the second path as its own (invalid) switch:
  MSBUILD : error MSB1006: Property is not valid.
  Switch: .../test/UnitTests.proj

Wrap the value in inner double quotes ('/p:Projects="a;b"') so MSBuild treats
the semicolon-separated list as a single property value, matching the original
Queue Tests step. The Windows leg already used the equivalent escaped-quote form.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
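The quoting fix in the last commit can be illustrated with a toy splitter. This is not MSBuild's actual parser, only a minimal model of the behavior the commit message describes: a ';' outside double quotes starts a new item, while a ';' inside double quotes stays part of the value. The function name `split_outside_quotes` is hypothetical.

```shell
#!/usr/bin/env bash
# Toy model of semicolon splitting; NOT MSBuild's real command-line parser.
set -euo pipefail

# Print one item per line, splitting on ';' only when outside double quotes.
split_outside_quotes() {
  local s=$1 part="" in_q=0 i ch
  for (( i = 0; i < ${#s}; i++ )); do
    ch=${s:i:1}
    if [[ $ch == '"' ]]; then
      in_q=$(( 1 - in_q ))   # toggle the "inside quotes" state
      part+=$ch
    elif [[ $ch == ';' && $in_q -eq 0 ]]; then
      printf '%s\n' "$part"  # unquoted ';' ends the current item
      part=""
    else
      part+=$ch
    fi
  done
  printf '%s\n' "$part"
}

# Unquoted: the second path becomes its own (invalid) item.
split_outside_quotes '/p:Projects=sdk.slnx;test/UnitTests.proj'
# Inner double quotes: the whole list stays one property value.
split_outside_quotes '/p:Projects="sdk.slnx;test/UnitTests.proj"'
```

In bash, the outer single quotes are what keep the inner double quotes (and the ';') from being interpreted by the shell before the argument reaches the build.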
mmitche commented Jun 23, 2026

Member Author

Experiment CI results (build 1477997)

Plumbing now works (argument parsing fixed, sdk.slnx untouched). One leg, TestBuild: linux (arm64), built, tested, and queued end-to-end successfully. The other legs, however, fail with two deterministic errors that reveal the core obstacle:

  1. MSB4057 — The target "Test" does not exist on TestAsset package projects, e.g. test/TestAssets/TestPackages/TemplateEngine/Microsoft.TemplateEngine.TestTemplates.csproj. These are in sdk.slnx but don't import Arcade's test targets, so a solution-wide -test can't invoke Test on them.
  2. NETSDK1047 — assets file doesn't have a target for <tfm>/<rid> (e.g. net11.0/win-x64, net10.0/linux-x64) for *.Tests.csproj. The single solution restore is RID-agnostic; the test projects need RID-qualified assets that the old Queue Tests step produced via UnitTests.proj's publish/restore.

Root cause: a single Arcade build.sh -build -pack -test applies its targets uniformly to every entry in /p:Projects. Adding the product solution to a -test invocation therefore tries to test the entire solution, which it was never designed for — exactly the reason Build and Queue Tests were originally split. The RunTests in-process guard only suppresses xUnit execution; it can't address the missing Test target or the RID restore gap.
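A toy model of that uniform application, assuming nothing about Arcade's internals beyond what is stated above: every requested action is paired with every entry in the Projects list. The `plan` helper is hypothetical and only prints the pairs.

```shell
#!/usr/bin/env bash
# Toy model: each requested action is applied to each Projects entry.
set -euo pipefail

# Hypothetical helper: first argument is a ';'-separated project list,
# remaining arguments are the requested actions.
plan() {
  local projects=$1; shift
  local entry action
  local IFS=';'              # split the project list on ';'
  for entry in $projects; do
    for action in "$@"; do
      printf '%s -> %s\n' "$action" "$entry"
    done
  done
}

plan 'sdk.slnx;test/UnitTests.proj' build pack test
```

The printed plan contains `test -> sdk.slnx`, which is the pairing that produces the MSB4057 and NETSDK1047 failures: there is no way in this model to request `test` for UnitTests.proj only.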

Conclusion: naively merging into one invocation isn't viable without either (a) scoping the Test target to only UnitTests.proj (not expressible in a single uniform Arcade invocation), (b) RID-restoring the whole solution (reintroducing the duplication we're trying to remove), or (c) Arcade/test-infra changes. Holding here for direction.
