indexed() is undefined behavior for rank-0 (zero-axis) histograms #430

@henryiii

If you end up with a rank-0 histogram, some operations support it but others can segfault. I ran into this in boost-histogram, where it only segfaults on Windows. I've got a fix at scikit-hep/boost-histogram#1153, but it would be nice to support it natively too. I asked Claude to generate a report (below). I'm on a Mac, so I didn't do the "before you file" suggestion at the end. You can see the failures in the integration-tests CI (https://github.com/scikit-hep/integration-tests/actions/runs/27257489976/job/80494999483) and in boost-histogram's own CI (https://github.com/scikit-hep/boost-histogram/actions/runs/27492667553); I assume that is enough.

🤖 AI text below 🤖

Summary

Iterating a rank-0 histogram (zero axes, one cell) with indexed() — directly, or indirectly via algorithm::sum(h, coverage::inner) / algorithm::empty(h, ...) — reads uninitialized per-axis state in indexed_range::iterator::operator++. The read isn't flagged by ASan/UBSan (uninitialized stack read), so the loop usually terminates by luck and returns the correct result. With an unlucky stack layout the uninitialized end / end_skip values drive operator++ to advance the underlying iterator past end() (or walk c off the indices_ buffer), producing an out-of-bounds read and a crash.

We hit this as a hard crash in the Python bindings (scikit-hep/boost-histogram#1153): bh.Histogram() (rank-0) .sum() reproducibly segfaulted on free-threaded Windows (CPython 3.14t) while passing on every other platform/build — purely because that build's stack frames differed.

Affected: Boost 1.90/1.91 (develop 5e9b7b1); the relevant code is unchanged on develop.

Minimal reproducer

#include <boost/histogram.hpp>
#include <boost/histogram/algorithm/sum.hpp>
#include <vector>

namespace bh = boost::histogram;

int main() {
    // Rank-0 requires a *dynamic* axis container with an empty vector.
    using axes_t = std::vector<bh::axis::regular<>>;
    auto h = bh::make_histogram(axes_t{});       // rank() == 0, size() == 1

    bh::algorithm::sum(h, bh::coverage::all);     // OK: uses plain hist iteration
    bh::algorithm::sum(h, bh::coverage::inner);   // UB: drives indexed() over rank-0
}

Compiles and (usually) runs cleanly, but sum(..., inner) reads uninitialized memory. MemorySanitizer flags it; ASan/UBSan do not. Stability depends entirely on stack contents.

Root cause

For rank > 0, the indexed_range constructor initializes one index_data per axis inside a for_each_axis lambda (indexed.hpp:328). For rank 0:

  • for_each_axis runs zero times, so indices_[0] (idx, begin, end, begin_skip, end_skip) is never initialized — yet the backing slot exists in the std::array<index_data, buffer_size> and is read by operator++.
  • The empty-range guard if (begin_ == end_) return; (indexed.hpp:323) does not fire: a rank-0 histogram still has one cell, so begin_ != end_.

The single iteration then runs operator++ (indexed.hpp:251–268), reading the uninitialized idx/end/end_skip. Because indices_.end() == indices_.begin() for rank 0, the if (++c == indices_.end()) guard never matches once c is incremented, so the loop walks c forward through uninitialized records while iter_ += c->end_skip moves the underlying iterator by arbitrary amounts → OOB read.
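The two conditions above can be illustrated with a small standalone model. This is a sketch, not the Boost code: `range_model`, the `buffer_size` value, and the use of `std::optional` (to make "never initialized" observable without invoking UB) are assumptions made for illustration; only the field names come from the report.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the per-axis record; field names follow the report.
struct index_data {
  int idx, begin, end;
  std::size_t begin_skip, end_skip;
};

constexpr std::size_t buffer_size = 4;  // assumed value, for illustration only

struct range_model {
  // An empty optional marks a slot that the constructor never wrote.
  std::array<std::optional<index_data>, buffer_size> indices{};
  std::size_t rank = 0;
  std::size_t cells = 1;  // product of axis extents; the empty product is 1

  explicit range_model(const std::vector<int>& extents) : rank(extents.size()) {
    assert(rank <= buffer_size);
    // Counterpart of the per-axis initialization: runs once per axis,
    // hence zero times for rank 0.
    for (std::size_t i = 0; i < rank; ++i) {
      indices[i] = index_data{0, 0, extents[i], 0, 0};
      cells *= static_cast<std::size_t>(extents[i]);
    }
  }

  // Counterpart of the `begin_ == end_` guard: true only when there are no cells.
  bool is_empty() const { return cells == 0; }
};
```

For an empty `extents` vector the model reports `rank == 0`, `is_empty() == false`, and an unset `indices[0]`, which is the combination the report describes: the guard does not fire, and the first slot was never written.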

Suggested upstream fixes

  1. Extend the early-out to cover rank 0 (treat it as a degenerate single-cell range, stepping operator++ straight to end_ without touching indices_).
  2. In operator++, short-circuit when indices_.begin() == indices_.end() (rank 0): just ++iter_.
  3. Have algorithm::sum/empty route rank-0 through the non-indexed path (as sum already does for coverage::all).
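Option 2 might look roughly like the following. This is a simplified odometer-style model, not a patch against indexed.hpp: `cursor`, `axis_index`, and `count_cells` are hypothetical names, the skip fields are omitted, and the buffer is zero-initialized here so the model stays well-defined even without the guard.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct axis_index { int idx; int end; };  // hypothetical, reduced per-axis record

struct cursor {
  std::size_t pos = 0;               // stand-in for the underlying iterator
  std::size_t rank = 0;
  std::array<axis_index, 4> axes{};  // fixed buffer; only [0, rank) is meaningful

  cursor& operator++() {
    ++pos;
    // Rank-0 short-circuit: advance the position and leave axes untouched.
    if (rank == 0) return *this;
    // Odometer-style carry, first axis fastest.
    std::size_t i = 0;
    while (++axes[i].idx == axes[i].end) {
      axes[i].idx = 0;
      if (++i == rank) break;
    }
    return *this;
  }
};

// Steps a cursor from the first cell to one past the last; returns the step count.
std::size_t count_cells(const std::vector<int>& extents) {
  cursor c;
  assert(extents.size() <= c.axes.size());
  std::size_t total = 1;
  for (int e : extents) {
    c.axes[c.rank++] = axis_index{0, e};
    total *= static_cast<std::size_t>(e);
  }
  std::size_t steps = 0;
  while (c.pos != total) { ++c; ++steps; }
  return steps;
}
```

With the guard in place, the rank-0 case takes exactly one step and never reads or writes `axes[0]`, while ranks above zero keep the usual carry logic.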

Downstream workaround (already applied)

scikit-hep/boost-histogram routes rank-0 around indexed(): sum() uses coverage::all, empty() checks the single cell directly. Rank-0 has no flow bins, so inner == all.


One caveat to flag before you file: I verified the root-cause analysis against the code and confirmed the path is stack-layout dependent (zero-init and uniform pattern-init both happen to terminate cleanly here), but I could not produce a portable always-crashing reproducer from macOS — that needs MemorySanitizer, which Apple clang doesn't ship. If you want a hard demonstration to include, the surest route is running the reproducer under MSan (a Linux clang build), or just cite the concrete downstream crash (free-threaded Windows). The maintainer (Hans Dembinski) will likely recognize it from the code analysis alone. Want me to also draft the MSan invocation, or adapt this into the exact GitHub issue body and open it via gh?
