CodSpeed regression: highspy 1.15.0 addRows ~4x slower in direct HiGHS handoff (#809 bump)

_Intent placeholder — @FBumann to replace with your own words._

> [!NOTE]
> The following was generated by AI (investigation + reproduction of the CodSpeed failure).

## What happened

CodSpeed went red on `master` with a −16.66% regression: [failing run](https://github.com/PyPSA/linopy/runs/84771255430). All 7 regressed benchmarks are `test_to_solver[highs-*]`; gurobi and every non-solver benchmark are unaffected.

The only relevant change between the compared commits is the **highspy bump `1.13.1 → 1.15.0`** (#809). The `benchmarks` extra pins highspy to an exact version *so that dependency bumps don't move the numbers*. Dependabot bumped that very pin, so the run measured the dependency change itself, and it revealed a real upstream slowdown.

## Root cause

The regression is in **`highspy.Highs.addRows`**, called by linopy's direct HiGHS handoff (`linopy/solvers.py`, `Highs._build_solver_model`). On `sparse_network(250)` the real `to_highspy` path is ~2× slower end-to-end (the `addRows` call itself ~4×):

| highspy | `to_highspy(sparse_network(250))` |
| ------- | --------------------------------- |
| 1.13.1  | ~22 ms |
| 1.15.0  | ~46 ms |
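
The ~2× end-to-end and ~4× `addRows` figures are compatible if `addRows` takes roughly a third of the handoff on 1.13.1. A back-of-the-envelope check, assuming the 4× factor is exact and that everything outside `addRows` costs the same on both versions (neither assumption was measured separately here):

```python
# What share of the handoff must addRows take for a ~4x slowdown in addRows
# alone to explain the observed end-to-end change?
# Assumptions (not measured here): the 4x factor is exact, and everything
# outside addRows costs the same on both highspy versions.
t_old_ms = 22.0  # highspy 1.13.1, from the table above
t_new_ms = 46.0  # highspy 1.15.0, from the table above
addrows_factor = 4.0

end_to_end = t_new_ms / t_old_ms
# t_new = t_old * ((1 - f) + factor * f)  =>  f = (end_to_end - 1) / (factor - 1)
addrows_share = (end_to_end - 1) / (addrows_factor - 1)
addrows_old_ms = addrows_share * t_old_ms
addrows_new_ms = addrows_factor * addrows_old_ms

print(f"end-to-end slowdown: {end_to_end:.2f}x")
print(f"implied addRows share on 1.13.1: {addrows_share:.0%}")
print(f"implied addRows time: {addrows_old_ms:.1f} ms -> {addrows_new_ms:.1f} ms")
```

Under those assumptions `addRows` would account for about 8 ms of the 22 ms on 1.13.1 and about 32 ms of the 46 ms on 1.15.0. This is an estimate from the rounded totals, not a profile.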

## Reproduction

Single PEP 723 script, linopy-native, runnable directly with `uv` — mirrors the CI benchmark via the pytest-benchmark `benchmark` fixture (no manual timing, no hand-built matrices). Lives at `dev-scripts/highs-addrows-repro/`; every dep but `highspy` is pinned in the accompanying `.lock`, and `highspy` is injected per run so the two versions share an identical env.

```bash
uv run --locked --with highspy==1.13.1 highs_handoff_bench.py --benchmark-columns=median,mean
uv run --locked --with highspy==1.15.0 highs_handoff_bench.py --benchmark-columns=median,mean
```

```
matrices.A rows=6000 cols=12000 stored_nnz=1506000 structural_nnz=18000 (98.8% explicit zeros)

  Name (time in ms)    Median
  test_to_highspy  [1.13.1]  22.2
  test_to_highspy  [1.15.0]  46.4
```
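
The `matrices.A` figures in that output can be cross-checked from the model dimensions alone. The sketch below assumes linopy stores one coefficient per (bus, line) pair of the dense `flow * incidence` broadcast, zero or not, which is what the printed counts suggest:

```python
# Derive the reported matrix counts from the benchmark's dimensions.
# Assumption: one stored coefficient per (bus, line) pair of the dense
# `flow * incidence` broadcast, whether or not the incidence entry is zero.
n_buses = n_lines = 250
n_time = min(n_buses, 24)

rows = n_buses * n_time              # one balance constraint per (bus, time)
cols = (n_buses + n_lines) * n_time  # gen variables + flow variables

stored_per_row = 1 + n_lines         # gen term + a flow term for every line
structural_per_row = 1 + 2           # gen term + the two ring lines at a bus

stored_nnz = rows * stored_per_row
structural_nnz = rows * structural_per_row
zero_share = 1 - structural_nnz / stored_nnz

print(
    f"rows={rows} cols={cols} stored_nnz={stored_nnz} "
    f"structural_nnz={structural_nnz} ({zero_share:.1%} explicit zeros)"
)
```

The numbers agree with the output above: each balance row carries 251 stored coefficients, of which only 3 are nonzero.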

<details>
<summary>highs_handoff_bench.py</summary>

```python
# /// script
# requires-python = ">=3.12,<3.13"
# dependencies = [
#     "linopy",
#     "pytest",
#     "pytest-benchmark",
#     "pytest-benchmem",  # memory companion (memray peak pass): shows #814
# ]
# ///
"""Reproduce the HiGHS handoff slowdown -- linopy-native, runnable via uv.

Mirrors the CI benchmark ``test_to_solver[highs-sparse_network-n=250]``: build
the linopy model, then hand it to HiGHS through pytest-benchmark's ``benchmark``
fixture. No manual timing, no hand-built matrices. highspy is the variable under
test, injected per run so both versions share an identical (locked) env:

    # time only (matches CI / CodSpeed WallTime)
    uv run --locked --with highspy==1.13.1 highs_handoff_bench.py --benchmark-columns=median,mean
    uv run --locked --with highspy==1.15.0 highs_handoff_bench.py --benchmark-columns=median,mean

    # time + peak memory (pytest-benchmem, via memray) -- shows #814 too
    uv run --locked --with highspy==1.15.0 highs_handoff_bench.py --benchmark-memory

Regenerate the lock with:  uv lock --script highs_handoff_bench.py
"""

from __future__ import annotations

import sys

import numpy as np
import pandas as pd
import pytest
import xarray as xr

import linopy


def build_sparse_network(n_buses: int = 250) -> linopy.Model:
    """The ``sparse_network`` benchmark: a ring network flow balance.

    ``flow * incidence`` broadcasts against a dense bus x line block -- ordinary
    linopy modelling, and the reason the constraint matrix fills with zeros.
    """
    rng = np.random.default_rng(42)
    n_lines = n_buses
    n_time = min(n_buses, 24)
    buses = pd.RangeIndex(n_buses, name="bus")
    lines = pd.RangeIndex(n_lines, name="line")
    time_ = pd.RangeIndex(n_time, name="time")

    bus_from = np.arange(n_lines)
    bus_to = (bus_from + 1) % n_buses

    m = linopy.Model()
    gen = m.add_variables(lower=0, coords=[buses, time_], name="gen")
    flow = m.add_variables(lower=-100, upper=100, coords=[lines, time_], name="flow")

    incidence = np.zeros((n_buses, n_lines))
    incidence[bus_to, np.arange(n_lines)] = 1
    incidence[bus_from, np.arange(n_lines)] = -1
    incidence_da = xr.DataArray(incidence, coords=[buses, lines])

    demand = xr.DataArray(rng.uniform(10, 100, size=(n_buses, n_time)), coords=[buses, time_])
    net_flow = (flow * incidence_da).sum("line")
    m.add_constraints(gen + net_flow == demand, name="balance")
    m.add_objective(gen.sum())
    return m


@pytest.fixture(scope="module")
def model() -> linopy.Model:
    m = build_sparse_network(250)
    A = m.matrices.A  # trigger + report the explicit-zero fill (#814)
    zeros = A.nnz - int(np.count_nonzero(A.data))
    print(
        f"\nmatrices.A rows={A.shape[0]} cols={A.shape[1]} "
        f"stored_nnz={A.nnz} structural_nnz={A.nnz - zeros} "
        f"({100 * zeros / A.nnz:.1f}% explicit zeros)"
    )
    return m


def test_to_highspy(benchmark, model: linopy.Model) -> None:
    benchmark(lambda: linopy.io.to_highspy(model))


if __name__ == "__main__":
    # Run this file as its own pytest session; ``-o addopts=`` drops any config
    # inherited from a surrounding repo so the script is self-contained.
    raise SystemExit(
        pytest.main(
            [__file__, "-q", "--benchmark-only", "-o", "addopts=", "-p", "no:cacheprovider", *sys.argv[1:]]
        )
    )

```
</details>

## Options

- Report upstream to [ERGO-Code/HiGHS](https://github.com/ERGO-Code/HiGHS) — the slow path is `addRows`.
- Cap/hold highspy (keep `benchmarks` on 1.13.1; consider `!=1.15.0` in the `solvers` extra) until upstream is resolved, or acknowledge the CodSpeed baseline step.
- **Independent of this:** #814 removes the ~98.8% explicit zeros we hand `addRows` in the first place, which makes linopy insensitive to this regression (and faster on every highspy version). That's an optimisation, not the fix for the upstream slowdown, so it's tracked separately.
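
To illustrate what "removing explicit zeros" means for the data handed to `addRows`, here is a minimal pure-Python sketch on CSR arrays. The helper `drop_explicit_zeros` is hypothetical and written for illustration only; it is not linopy's code and not necessarily how #814 does it.

```python
def drop_explicit_zeros(indptr, indices, data):
    """Return CSR arrays with zero-valued stored entries removed.

    Hypothetical helper for illustration; not part of linopy or highspy.
    """
    new_indptr, new_indices, new_data = [0], [], []
    for row in range(len(indptr) - 1):
        # Walk the stored entries of this row and keep only the nonzeros.
        for k in range(indptr[row], indptr[row + 1]):
            if data[k] != 0:
                new_indices.append(indices[k])
                new_data.append(data[k])
        new_indptr.append(len(new_data))
    return new_indptr, new_indices, new_data


# Two rows over four columns; every entry is stored, most of them zero.
indptr = [0, 4, 8]
indices = [0, 1, 2, 3, 0, 1, 2, 3]
data = [1.0, 0.0, -1.0, 0.0, 0.0, 1.0, 0.0, -1.0]

slim_indptr, slim_indices, slim_data = drop_explicit_zeros(indptr, indices, data)
print(f"stored entries: {len(data)} -> {len(slim_data)}")
```

On a SciPy CSR matrix, the in-place method `eliminate_zeros()` has the same effect.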

<details>
<summary>All regressed benchmarks (BASE 848366b / 1.13.1 → HEAD e861678 / 1.15.0)</summary>

| Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- |
| test_to_solver[highs-sparse_network-n=250] | 83.5 ms | 162.7 ms | -48.7% |
| test_to_solver[highs-kvl_cycles-severity=100] | 674.5 ms | 916.2 ms | -26.38% |
| test_to_solver[highs-cumsum-severity=100] | 177.5 ms | 217.2 ms | -18.27% |
| test_to_solver[highs-rolling-severity=50] | 540.3 ms | 658.5 ms | -17.96% |
| test_to_solver[highs-cumsum-severity=50] | 44.4 ms | 53.4 ms | -16.82% |
| test_to_solver[highs-kvl_cycles-severity=0] | 1.2 s | 1.4 s | -14% |
| test_to_solver[highs-kvl_cycles-severity=50] | 976.7 ms | 1,122.3 ms | -12.97% |
</details>
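
A note on reading the Efficiency column: the figures appear consistent with `BASE / HEAD - 1`, not with the relative increase in time, so −48.7% corresponds to roughly a 1.95× slowdown. This is inferred from the table values, not taken from CodSpeed documentation. A quick check on three of the rows:

```python
# (BASE ms, HEAD ms, reported efficiency in %) copied from the table above.
benchmarks = {
    "highs-sparse_network-n=250": (83.5, 162.7, -48.7),
    "highs-kvl_cycles-severity=100": (674.5, 916.2, -26.38),
    "highs-kvl_cycles-severity=50": (976.7, 1122.3, -12.97),
}

results = {}
for name, (base_ms, head_ms, reported_pct) in benchmarks.items():
    # Assumed formula, inferred from the table: efficiency = BASE / HEAD - 1.
    efficiency_pct = (base_ms / head_ms - 1) * 100
    slowdown = head_ms / base_ms
    results[name] = (efficiency_pct, slowdown)
    print(
        f"{name}: computed {efficiency_pct:.2f}% "
        f"(reported {reported_pct}%), slowdown {slowdown:.2f}x"
    )
```

Small differences against the reported values are expected, since BASE and HEAD are rounded in the table.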

