POC: Time-based Helix test scheduling with AzDO history #54939
Draft
MichaelSimons wants to merge 13 commits into
Adds time-based work item scheduling inspired by dotnet/roslyn's AssemblyScheduler. Instead of partitioning by method count, this uses historical test execution durations from Azure DevOps to create Helix work items targeting ~10 minutes each at the individual test method level.

New files:
- AzdoClient.cs: Lightweight REST client for AzDO builds/test results API
- TestHistoryManager.cs: Fetches per-test duration history from last successful CI build, with branch fallback
- TestMethodDiscovery.cs: Discovers individual test methods from compiled assemblies using reflection metadata
- TimeBasedScheduler.cs: Greedy first-fit bin-packing scheduler with configurable target time, command-line length limits, and count-based fallback when history is unavailable
- HelixTasks.SchedulerTool/: Local console app for validating scheduling plans without running in CI

Modified:
- SDKCustomCreateXUnitWorkItemsWithTestExclusion.cs: Added UseTimeBasedScheduling mode with AzDO parameters, integrated time-based scheduling path alongside existing count-based approach
- HelixTasks.csproj: Added System.Text.Json, InternalsVisibleTo

The existing count-based scheduling is preserved as the default and serves as fallback when history is unavailable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
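The greedy first-fit packing described in this commit can be pictured with a minimal Python sketch. It is an illustration only, not the code in TimeBasedScheduler.cs: the names `WorkItem` and `schedule_first_fit` are hypothetical, tests are visited in input order (the real ordering is not stated here), and a test longer than the target is assumed to get a work item of its own.

```python
from dataclasses import dataclass, field


@dataclass
class WorkItem:
    """Hypothetical container for one Helix work item."""
    tests: list[str] = field(default_factory=list)
    total_seconds: float = 0.0


def schedule_first_fit(durations: dict[str, float],
                       target_seconds: float = 600.0) -> list[WorkItem]:
    """Place each test into the first work item that still has room.

    `durations` maps a test method name to its historical run time in
    seconds. The 600-second default mirrors the ~10 minute target.
    """
    items: list[WorkItem] = []
    for name, seconds in durations.items():
        for item in items:
            if item.total_seconds + seconds <= target_seconds:
                item.tests.append(name)
                item.total_seconds += seconds
                break
        else:
            # No existing work item had room (or the test alone exceeds
            # the target), so open a new one.
            items.append(WorkItem([name], seconds))
    return items
```

For example, `schedule_first_fit({"A": 400, "B": 300, "C": 200})` puts `A` and `C` together (600 seconds) and leaves `B` in a second work item, because `B` did not fit next to `A` when it was visited.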
Update XUnitRunner.targets to pass all new time-based scheduling properties to SDKCustomCreateXUnitWorkItemsWithTestExclusion.

Add auto-configuration in UnitTests.proj using AzDO built-in variables:
- AzdoProjectUri: derived from SYSTEM_COLLECTIONURI + SYSTEM_TEAMPROJECT
- AzdoAccessToken: from SYSTEM_ACCESSTOKEN (already mapped in sdk-build.yml)
- AzdoDefinitionId: from SYSTEM_DEFINITIONID
- AzdoTargetBranch: from SYSTEM_PULLREQUEST_TARGETBRANCH (falls back to main)

To enable: set UseTimeBasedScheduling=true in the pipeline or UnitTests.proj. All other config is auto-derived from the pipeline environment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
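The variable mapping in this commit can be sketched in Python as follows. The real configuration lives in MSBuild (UnitTests.proj), so this is only a model of the derivation: the function name `derive_azdo_settings` is hypothetical, and joining the collection URI and project name with a single slash is an assumption about how AzdoProjectUri is formed.

```python
from typing import Mapping


def derive_azdo_settings(env: Mapping[str, str]) -> dict[str, str]:
    """Map AzDO built-in pipeline variables to the scheduling properties."""
    collection = env.get("SYSTEM_COLLECTIONURI", "")
    project = env.get("SYSTEM_TEAMPROJECT", "")
    return {
        # Assumption: the project URI is the collection URI plus the project name.
        "AzdoProjectUri": collection.rstrip("/") + "/" + project,
        "AzdoAccessToken": env.get("SYSTEM_ACCESSTOKEN", ""),
        "AzdoDefinitionId": env.get("SYSTEM_DEFINITIONID", ""),
        # Falls back to main when the build is not a pull request.
        "AzdoTargetBranch": env.get("SYSTEM_PULLREQUEST_TARGETBRANCH") or "main",
    }
```

In a pipeline this would be called with `os.environ`; passing a plain dictionary keeps the sketch easy to exercise locally.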
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The method-level filter strings (FullyQualifiedName per test) are much longer than the old class-level filters. On Windows, cmd.exe has an 8191-character command line limit, so many work items were failing with 'The input line is too long' (exit code 255).

Fix: Make MaxFilterLength OS-aware:
- Windows: 7000 chars (leaving ~1200 for the command prefix)
- POSIX: 25000 chars (bash supports ~128KB+)

Also enforce the filter length limit in the count-based fallback path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of passing the method-level --filter on the command line (which hits the 8191-char cmd.exe limit on Windows), write each work item's filter to a .rsp response file in the publish directory and reference it via @file.rsp on the command line. This is the same approach used by dotnet/roslyn's Helix test runner. The filter string can now be arbitrarily long, so work items are sized purely by time budget (or count-based fallback), not constrained by command-line length. The TimeBasedScheduler's MaxFilterLength is now set to 100K (effectively unlimited) since the rsp file has no length constraint. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
With filters in response files, work item sizing is purely driven by the time budget (or count for fallback). Remove all filter-length tracking and the isPosixShell parameter from TimeBasedScheduler. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of 'dotnet test @filter.rsp' (which expands the RSP and hits the CreateProcess 32K limit), invoke vstest.console.dll directly:

dotnet exec vstest.console.dll @workitem.rsp

The RSP file contains ALL arguments (assembly, loggers, blame, filter) and vstest.console.dll reads it natively without spawning a child process, completely eliminating any command-line length constraint. This matches the approach used by dotnet/roslyn's Helix test runner.

MTP projects continue to use dotnet exec with the test assembly directly since they already handle arguments without the CreateProcess issue.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
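A rough Python sketch of the response-file approach from this commit: every argument goes into workitem.rsp, one per line, and the command line stays short no matter how long the filter grows. The helper name `write_work_item_rsp` is hypothetical, and the exact argument set and any quoting the response file needs are assumptions. `/Logger`, `/Blame` and `/TestCaseFilter` are standard vstest.console options, but the PR's actual values are not shown here.

```python
from pathlib import Path


def write_work_item_rsp(directory: Path, assembly: str, test_filter: str) -> str:
    """Write all vstest arguments to workitem.rsp and return the short command."""
    args = [
        assembly,
        "/Logger:trx",                       # assumption: TRX logger
        "/Blame",
        f"/TestCaseFilter:{test_filter}",    # may be arbitrarily long
    ]
    rsp = directory / "workitem.rsp"
    rsp.write_text("\n".join(args) + "\n", encoding="utf-8")
    # Only the file name appears on the command line.
    return f"dotnet exec vstest.console.dll @{rsp.name}"
```

The returned command has the same length whether the filter names ten tests or ten thousand, which is the point of moving the arguments into the file.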
cmd.exe expands %variables% at parse time, so 'set /p var=<file&& dotnet exec %var%' expands %var% to empty string before set runs. Fix: write a .cmd batch script to the payload directory where each line is parsed independently. The Helix command is just the script filename. POSIX continues to use inline commands since \ is evaluated at runtime. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
xUnit: set parallelizeAssembly and parallelizeTestCollections to false
MSTest: set MSTestParallelizeWorkers to 1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Setting MSTestParallelizeWorkers=1 still causes MSTest targets to inject [Parallelize], which conflicts with [DoNotParallelize] attributes in several test projects. Setting scope to None prevents the attribute from being generated entirely, and is compatible with projects that already set MSTestParallelizeScope=None locally. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This was referenced Jun 23, 2026
On Windows Helix machines, DOTNET_ROOT may point to a system-installed .NET SDK with an incompatible (older) vstest.console.dll. This caused MissingMethodException crashes in all non-MTP test work items. Use HELIX_CORRELATION_PAYLOAD/d instead, which always contains the custom-built SDK matching the test assemblies. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bare method names with exact-match filter missed [Theory]/[InlineData] test cases whose FQN includes parameters (e.g. Method(arg1, arg2)). Using 'FullyQualifiedName~Method' (contains) ensures all parameterized variants are matched, resolving ~2,800 missing tests per leg. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
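The difference between the exact-match and contains filters can be shown with a small Python model. The helper names are hypothetical, and the matching functions only imitate the `=` and `~` operators of the test filter syntax rather than calling the real test platform. Escaping of special characters in names is not handled.

```python
def build_contains_filter(method_names: list[str]) -> str:
    """Join one FullyQualifiedName~name clause per method with '|' (OR)."""
    return "|".join(f"FullyQualifiedName~{name}" for name in method_names)


def matches_exact(test_case_fqn: str, name: str) -> bool:
    # Models FullyQualifiedName=name.
    return test_case_fqn == name


def matches_contains(test_case_fqn: str, name: str) -> bool:
    # Models FullyQualifiedName~name (substring match).
    return name in test_case_fqn


# A [Theory] case whose FQN carries its arguments.
theory_case = "Ns.Tests.Method(arg1, arg2)"
print(matches_exact(theory_case, "Ns.Tests.Method"))     # False
print(matches_contains(theory_case, "Ns.Tests.Method"))  # True
```

One consequence of substring matching is that a name which is a prefix of another (for example `Ns.Tests.Method` and `Ns.Tests.MethodAsync`) selects both, so the same test can land in more than one work item unless names are disambiguated.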
Summary
Proof-of-concept replacing the SDK's count-based Helix test partitioning with time-based scheduling, inspired by dotnet/roslyn's AssemblyScheduler. Uses historical test execution times from Azure DevOps to create work items targeting ~10 minutes each, at the individual test method level.

What's Changed

New Files
- test/HelixTasks/AzdoClient.cs: Lightweight AzDO REST client (builds + test results APIs)
- test/HelixTasks/TestHistoryManager.cs: Fetches per-test-method duration history from last successful CI build, with fallback to main
- test/HelixTasks/TestMethodDiscovery.cs: Discovers individual test methods from PE metadata via reflection
- test/HelixTasks/TimeBasedScheduler.cs: Greedy first-fit bin-packing scheduler (10-min target per work item)
- test/HelixTasks.SchedulerTool/: Local console app for validating scheduling plans offline

Modified Files
- test/HelixTasks/SDKCustomCreateXUnitWorkItemsWithTestExclusion.cs: Added UseTimeBasedScheduling mode with direct vstest.console.dll invocation via RSP files
- test/xunit-runner/XUnitRunner.targets: Passes time-based scheduling properties to MSBuild task
- test/UnitTests.proj: Auto-configures from AzDO pipeline variables, enables time-based scheduling by default

Design
- dotnet exec vstest.console.dll @workitem.rsp: all arguments (assembly, loggers, blame, filter) in a response file read natively by vstest.console.dll, eliminating all command-line length constraints
- .cmd batch scripts for correct variable expansion
- main