perf: scatter groupby-sum terms directly instead of unstacking#793
Open
FBumann wants to merge 2 commits into
Open
perf: scatter groupby-sum terms directly instead of unstacking#793FBumann wants to merge 2 commits into
FBumann wants to merge 2 commits into
Conversation
The fast path of LinearExpression.groupby(...).sum() used ds.unstack(group_dim, fill_value=...) followed by a stack, which materializes 2-3 intermediate copies of the padded result (n_groups x max_group_size x nterm) and goes through pandas MultiIndex machinery sized by the number of elements. Instead, factorize the groups and scatter coeffs/vars directly into the preallocated padded result arrays; constants are group-summed with np.add.at. Peak memory drops to input + result (the minimum for the padded layout) and the grouping itself gets considerably faster. The result is unchanged: same dims, coords, term ordering and padding. The unstack-based implementation is kept as _sum_by_unstack and still used for chunked (dask-backed) data, which cannot be scattered into numpy arrays. NaN group labels now raise an informative ValueError instead of failing inside unstack. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Merging this PR will improve performance by ×2.1
Performance Changes
Tip Curious why this is faster? Comment Comparing Footnotes
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note
The following content was generated by AI.
What this does
The fast path of
LinearExpression.groupby(...).sum()previously didds.unstack(group_dim, fill_value=...)followed by a stack. Thatmaterializes 2–3 intermediate copies of the padded result
(
n_groups × max_group_size × nterm) and routes through pandasMultiIndexmachinery sized by the number of elements.This change factorizes the groups and scatters
coeffs/varsdirectlyinto preallocated padded result arrays; constants are group-summed with
np.add.at. Peak memory drops to input + result (the minimum for thepadded layout), and the grouping itself gets considerably faster. The
result is unchanged: same dims, coords, term ordering and padding.
The unstack-based implementation is kept as
_sum_by_unstackand isstill used for chunked (dask-backed) data, which cannot be scattered into
numpy arrays.
NaNgroup labels now raise an informativeValueErrorinstead of failing inside
unstack.Notes
linopy/expressions.pyand adds tests intest/test_linear_expression.py(124 new lines)._sum_by_unstackretains the currentmasternames_to_droplogic(drop every coordinate aligned to
group_dim), so the slow path keepsthe existing behavior.
Verification
pytest test/test_linear_expression.py→ 309 passed-k "group or sum or scatter or unstack") → 73 passedruff check,ruff format --check,mypy linopy/expressions.py→ clean