[SPARK-57714][CORE] Add loser tree based k-way merger for UnsafeExternalSorter spill merge#56804
Open
s-unscrupulous wants to merge 1 commit into
Open
[SPARK-57714][CORE] Add loser tree based k-way merger for UnsafeExternalSorter spill merge#56804s-unscrupulous wants to merge 1 commit into
s-unscrupulous wants to merge 1 commit into
Conversation
a9dbcb1 to
c50a765
Compare
uros-b
reviewed
Jun 26, 2026
| "in UnsafeExternalSorter; otherwise use the original priority-queue based merger. " + | ||
| "Loser tree performs one comparison per replay versus two for heap sift-down, which " + | ||
| "can be faster when there are many spills.") | ||
| .version("4.1.0") |
Member
There was a problem hiding this comment.
Suggested change
| .version("4.1.0") | |
| .version("4.3.0") |
…nalSorter spill merge ### What changes were proposed in this pull request? Add a loser tree based k-way merger as an opt-in alternative to the existing priority-queue (binary heap) based `UnsafeSorterSpillMerger` used by `UnsafeExternalSorter` to merge spill files. The new merger is gated behind config `spark.unsafe.sorter.spill.merger.useLoserTree`, defaulting to `false` so behavior is unchanged unless explicitly enabled. ### Why are the changes needed? When `UnsafeExternalSorter` produces many spill files, the merge phase is dominated by comparator cost. A loser tree performs one comparison and one loser-slot read per level on each replay, versus two comparisons plus two child reads for heap sift-down. For k-way merges this roughly halves both the number of comparisons and the array touches per popped record, improving end-to-end sort latency for spill-heavy workloads. Microbenchmark on Apple M4 / JDK 17 over 500K records per run shows **1.4x to 2.9x speedup** across prefix-collision levels and run counts from 4 to 256. See `core/benchmarks/SpillMergerBenchmark-results.txt`. | runs | low-collision | medium-collision | high-collision | | ---: | :-----------: | :--------------: | :------------: | | 4 | 1.5x | 1.9x | 1.4x | | 8 | 1.8x | 2.3x | 1.6x | | 16 | 2.2x | 2.5x | 1.7x | | 64 | 2.9x | 2.0x | 1.8x | | 256 | 2.6x | 1.8x | 1.8x | ### Does this PR introduce _any_ user-facing change? No. The new behavior is gated behind an internal config defaulted to `false`. Existing behavior is unchanged. ### How was this patch tested? - New unit test suite `UnsafeLoserTreeSpillMergerSuite` (12 tests, all pass) - Extended `UnsafeExternalSorterSuite` to cover the loser tree path - New `SpillMergerBenchmark` for performance comparison; results included under `core/benchmarks/` - Ran `./dev/scalastyle` and `./dev/lint-java` locally — both pass ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude (Anthropic)
c50a765 to
6ea24c5
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Add a loser tree based k-way merger as an opt-in alternative to the
existing priority-queue (binary heap) based
UnsafeSorterSpillMergerused by
UnsafeExternalSorterto merge spill files.The new merger is gated behind config
spark.unsafe.sorter.spill.merger.useLoserTree, defaulting tofalseso behavior is unchanged unless explicitly enabled.
Why are the changes needed?
When
UnsafeExternalSorterproduces many spill files, the merge phaseis dominated by comparator cost. A loser tree performs one comparison
and one loser-slot read per level on each replay, versus two
comparisons plus two child reads for heap sift-down. For k-way merges
this roughly halves both the number of comparisons and the array
touches per popped record, improving end-to-end sort latency for
spill-heavy workloads.
Microbenchmark on Apple M4 / JDK 17 over 500K records per run shows
1.4x to 2.9x speedup across prefix-collision levels and run counts
from 4 to 256. Full results in
core/benchmarks/SpillMergerBenchmark-results.txt.Concrete numbers at the most spill-heavy point (runs=256, 500K records/run, low collision):
Does this PR introduce any user-facing change?
No. The new behavior is gated behind an internal config defaulted to
false. Existing behavior is unchanged.How was this patch tested?
UnsafeLoserTreeSpillMergerSuite(12 tests, all pass)UnsafeExternalSorterSuiteto cover the loser tree pathSpillMergerBenchmarkfor performance comparison; resultscommitted under
core/benchmarks/./dev/scalastyleand./dev/lint-javalocally — both passWas this patch authored or co-authored using generative AI tooling?
Generated-by: Claude (Anthropic)