Is your feature request related to a problem or challenge?
SortMergeJoinExec currently always runs in a symmetric, hash-partitioned mode:
both inputs are hash-partitioned on the join keys and sorted, and partition i
of the left is merge-joined with partition i of the right.
When one side is small, hash-repartitioning the large side is wasteful. For
hash joins this is already handled by PartitionMode::CollectLeft (collect the
small build side once, broadcast it, don't repartition the probe side). This PR
brings the same idea to sort-merge joins.
Describe the solution you'd like
Implement CollectLeft mode for SMJ
Describe alternatives you've considered
No response
Additional context
No response
Is your feature request related to a problem or challenge?
SortMergeJoinExec currently always runs in a symmetric, hash-partitioned mode:
both inputs are hash-partitioned on the join keys and sorted, and partition i
of the left is merge-joined with partition i of the right.
When one side is small, hash-repartitioning the large side is wasteful. For
hash joins this is already handled by PartitionMode::CollectLeft (collect the
small build side once, broadcast it, don't repartition the probe side). This PR
brings the same idea to sort-merge joins.
Describe the solution you'd like
Implement CollectLeft mode for SMJ
Describe alternatives you've considered
No response
Additional context
No response