CodSpeed regression: highspy 1.15.0 addRows ~4x slower in direct HiGHS handoff (#809 bump) #813

Description

@FBumann

Intent placeholder — @FBumann to replace with your own words.

Note

The following was generated by AI (investigation + reproduction of the CodSpeed failure).

What happened

CodSpeed went red on master with a −16.66% regression (see the failing run). All 7 regressed benchmarks are test_to_solver[highs-*]; the gurobi benchmarks and every non-solver benchmark are unaffected.

The only relevant change between the compared commits is the highspy bump 1.13.1 → 1.15.0 (#809). The benchmarks extra pins highspy to an exact version precisely so that dependency bumps don't move the numbers. Dependabot bumped that very pin, so the run measured the dependency itself, and it revealed a real upstream slowdown.

Root cause

The regression is in highspy.Highs.addRows, called by linopy's direct HiGHS handoff (linopy/solvers.py, Highs._build_solver_model). On sparse_network(250) the real to_highspy path is ~2× slower end-to-end (the addRows call itself ~4×):

highspy   to_highspy(sparse_network(250))
1.13.1    ~22 ms
1.15.0    ~46 ms
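
The two figures (~2× end-to-end, ~4× for addRows) are consistent with each other if addRows is only part of the handoff. A back-of-the-envelope split, assuming the ~4× factor is exact and that everything outside addRows costs the same on both versions (neither assumption is measured in this issue):

```python
# Rough decomposition of the handoff time into addRows and "everything else".
# Assumptions (not measured in this issue): addRows is exactly 4x slower on
# 1.15.0, and the rest of the handoff costs the same on both versions.
total_old_ms = 22.0   # to_highspy on highspy 1.13.1 (table above)
total_new_ms = 46.0   # to_highspy on highspy 1.15.0 (table above)
addrows_factor = 4.0  # the "~4x" quoted for the addRows call itself

# Solve:  addrows + rest = total_old,  factor * addrows + rest = total_new
addrows_old_ms = (total_new_ms - total_old_ms) / (addrows_factor - 1.0)
rest_ms = total_old_ms - addrows_old_ms
addrows_new_ms = addrows_factor * addrows_old_ms

print(f"addRows: {addrows_old_ms:.0f} ms -> {addrows_new_ms:.0f} ms")
print(f"rest of handoff: {rest_ms:.0f} ms on both versions")
print(f"end-to-end slowdown: {total_new_ms / total_old_ms:.1f}x")
```

Under those assumptions addRows would account for roughly a third of the 1.13.1 handoff and about two thirds of the 1.15.0 handoff; a profiler run would be needed to confirm the actual split.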

Reproduction

Single PEP 723 script, linopy-native, runnable directly with uv — mirrors the CI benchmark via the pytest-benchmark benchmark fixture (no manual timing, no hand-built matrices). Lives at dev-scripts/highs-addrows-repro/; every dep but highspy is pinned in the accompanying .lock, and highspy is injected per run so the two versions share an identical env.

uv run --locked --with highspy==1.13.1 highs_handoff_bench.py --benchmark-columns=median,mean
uv run --locked --with highspy==1.15.0 highs_handoff_bench.py --benchmark-columns=median,mean
matrices.A rows=6000 cols=12000 stored_nnz=1506000 structural_nnz=18000 (98.8% explicit zeros)

  Name (time in ms)    Median
  test_to_highspy  [1.13.1]  22.2
  test_to_highspy  [1.15.0]  46.4
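
The zero-fill figures in that output can be reproduced by hand from the model's dimensions. The sketch below assumes, as the reported counts suggest, that each balance row stores one gen coefficient plus one coefficient per line (the dense bus x line product), of which only the two incident lines are nonzero:

```python
# Back-of-the-envelope check of the matrices.A line printed by the script.
# Assumption: each balance row stores 1 gen entry + one entry per line, and
# only the 2 lines touching that bus carry a nonzero coefficient (ring network).
n_buses = 250
n_lines = n_buses            # ring: one line per bus
n_time = min(n_buses, 24)

rows = n_buses * n_time                      # one balance constraint per (bus, time)
cols = n_buses * n_time + n_lines * n_time   # gen + flow variables
stored_nnz = rows * (1 + n_lines)            # gen entry + dense flow block
structural_nnz = rows * (1 + 2)              # gen entry + incoming/outgoing line
zero_share = 100 * (stored_nnz - structural_nnz) / stored_nnz

print(f"rows={rows} cols={cols} stored_nnz={stored_nnz} "
      f"structural_nnz={structural_nnz} ({zero_share:.1f}% explicit zeros)")
```

The computed counts match the reported line, which supports the reading that the fill comes from the dense incidence block rather than from anything HiGHS-specific.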
highs_handoff_bench.py
# /// script
# requires-python = ">=3.12,<3.13"
# dependencies = [
#     "linopy",
#     "pytest",
#     "pytest-benchmark",
#     "pytest-benchmem",  # memory companion (memray peak pass): shows #814
# ]
# ///
"""Reproduce the HiGHS handoff slowdown -- linopy-native, runnable via uv.

Mirrors the CI benchmark ``test_to_solver[highs-sparse_network-n=250]``: build
the linopy model, then hand it to HiGHS through pytest-benchmark's ``benchmark``
fixture. No manual timing, no hand-built matrices. highspy is the variable under
test, injected per run so both versions share an identical (locked) env:

    # time only (matches CI / CodSpeed WallTime)
    uv run --locked --with highspy==1.13.1 highs_handoff_bench.py --benchmark-columns=median,mean
    uv run --locked --with highspy==1.15.0 highs_handoff_bench.py --benchmark-columns=median,mean

    # time + peak memory (pytest-benchmem, via memray) -- shows #814 too
    uv run --locked --with highspy==1.15.0 highs_handoff_bench.py --benchmark-memory

Regenerate the lock with:  uv lock --script highs_handoff_bench.py
"""

from __future__ import annotations

import sys

import numpy as np
import pandas as pd
import pytest
import xarray as xr

import linopy


def build_sparse_network(n_buses: int = 250) -> linopy.Model:
    """The ``sparse_network`` benchmark: a ring network flow balance.

    ``flow * incidence`` broadcasts against a dense bus x line block -- ordinary
    linopy modelling, and the reason the constraint matrix fills with zeros.
    """
    rng = np.random.default_rng(42)
    n_lines = n_buses
    n_time = min(n_buses, 24)
    buses = pd.RangeIndex(n_buses, name="bus")
    lines = pd.RangeIndex(n_lines, name="line")
    time_ = pd.RangeIndex(n_time, name="time")

    bus_from = np.arange(n_lines)
    bus_to = (bus_from + 1) % n_buses

    m = linopy.Model()
    gen = m.add_variables(lower=0, coords=[buses, time_], name="gen")
    flow = m.add_variables(lower=-100, upper=100, coords=[lines, time_], name="flow")

    incidence = np.zeros((n_buses, n_lines))
    incidence[bus_to, np.arange(n_lines)] = 1
    incidence[bus_from, np.arange(n_lines)] = -1
    incidence_da = xr.DataArray(incidence, coords=[buses, lines])

    demand = xr.DataArray(rng.uniform(10, 100, size=(n_buses, n_time)), coords=[buses, time_])
    net_flow = (flow * incidence_da).sum("line")
    m.add_constraints(gen + net_flow == demand, name="balance")
    m.add_objective(gen.sum())
    return m


@pytest.fixture(scope="module")
def model() -> linopy.Model:
    m = build_sparse_network(250)
    A = m.matrices.A  # trigger + report the explicit-zero fill (#814)
    zeros = A.nnz - int(np.count_nonzero(A.data))
    print(
        f"\nmatrices.A rows={A.shape[0]} cols={A.shape[1]} "
        f"stored_nnz={A.nnz} structural_nnz={A.nnz - zeros} "
        f"({100 * zeros / A.nnz:.1f}% explicit zeros)"
    )
    return m


def test_to_highspy(benchmark, model: linopy.Model) -> None:
    benchmark(lambda: linopy.io.to_highspy(model))


if __name__ == "__main__":
    # Run this file as its own pytest session; ``-o addopts=`` drops any config
    # inherited from a surrounding repo so the script is self-contained.
    raise SystemExit(
        pytest.main(
            [__file__, "-q", "--benchmark-only", "-o", "addopts=", "-p", "no:cacheprovider", *sys.argv[1:]]
        )
    )

Options

  • Report upstream to ERGO-Code/HiGHS — the slow path is addRows.
  • Cap/hold highspy (keep benchmarks on 1.13.1; consider !=1.15.0 in the solvers extra) until upstream is resolved, or acknowledge the CodSpeed baseline step.
  • Independent of this: #814 (Speed up direct solver handoff by dropping explicit zeros from the constraint matrix) removes the ~98.8% explicit zeros we hand to addRows in the first place, which makes linopy insensitive to this regression (and faster on every highspy version). That's an optimisation, not a fix for the upstream slowdown, so it's tracked separately.
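
To make the last option concrete: an explicit zero is an entry that is stored in the sparse matrix but whose value is 0, so it is still passed to the solver along with the real coefficients. A minimal, library-free sketch of removing such entries from CSR arrays follows. `drop_explicit_zeros` is a hypothetical helper for illustration, not the code in #814; with SciPy, `csr_matrix.eliminate_zeros()` does the same thing in place.

```python
def drop_explicit_zeros(indptr, indices, data):
    """Return CSR arrays with stored zero entries removed (hypothetical helper)."""
    new_indptr, new_indices, new_data = [0], [], []
    for row in range(len(indptr) - 1):
        # Walk the stored entries of this row and keep only the nonzero ones.
        for k in range(indptr[row], indptr[row + 1]):
            if data[k] != 0:
                new_indices.append(indices[k])
                new_data.append(data[k])
        new_indptr.append(len(new_data))
    return new_indptr, new_indices, new_data


# Two rows over four columns; each row stores all four entries, but only
# two per row are nonzero (the same pattern as the dense bus x line block).
indptr = [0, 4, 8]
indices = [0, 1, 2, 3, 0, 1, 2, 3]
data = [-1.0, 1.0, 0.0, 0.0, 0.0, -1.0, 1.0, 0.0]

indptr2, indices2, data2 = drop_explicit_zeros(indptr, indices, data)
print(len(data), "->", len(data2), "stored entries")
```

The row pointers shrink along with the data, so the result is still a valid CSR triple that could be handed to a row-wise API.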
All regressed benchmarks (BASE 848366b / 1.13.1 → HEAD e861678 / 1.15.0)
Benchmark                                        BASE        HEAD          Efficiency
test_to_solver[highs-sparse_network-n=250]       83.5 ms     162.7 ms      -48.7%
test_to_solver[highs-kvl_cycles-severity=100]    674.5 ms    916.2 ms      -26.38%
test_to_solver[highs-cumsum-severity=100]        177.5 ms    217.2 ms      -18.27%
test_to_solver[highs-rolling-severity=50]        540.3 ms    658.5 ms      -17.96%
test_to_solver[highs-cumsum-severity=50]         44.4 ms     53.4 ms       -16.82%
test_to_solver[highs-kvl_cycles-severity=0]      1.2 s       1.4 s         -14%
test_to_solver[highs-kvl_cycles-severity=50]     976.7 ms    1,122.3 ms    -12.97%
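
The Efficiency column does not read like a plain slowdown factor. The values appear consistent with BASE / HEAD − 1, which is an inference from the numbers in the table rather than something confirmed against CodSpeed's documentation. A small check under that assumption:

```python
def efficiency_pct(base_ms: float, head_ms: float) -> float:
    """Relative change as BASE / HEAD - 1, in percent (assumed formula)."""
    return (base_ms / head_ms - 1.0) * 100.0


# (name, BASE ms, HEAD ms, reported %) for two rows of the table above
checks = [
    ("sparse_network-n=250", 83.5, 162.7, -48.7),
    ("kvl_cycles-severity=100", 674.5, 916.2, -26.38),
]
for name, base, head, reported in checks:
    computed = efficiency_pct(base, head)
    print(f"{name}: computed {computed:.2f}%, reported {reported}%")
```

Under this reading, −48.7% means HEAD takes about 1.95× as long as BASE (162.7 / 83.5), which lines up with the ~2× end-to-end slowdown measured in the reproduction. Small differences on other rows are expected because the table shows rounded timings.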


Labels: bug, dependencies, performance
