feat: add flag to strictly evaluate deterministic metrics by ev1lm0nk3y · Pull Request #24 · WillAbides/benchdiff

ev1lm0nk3y · 2026-06-04T08:56:44Z

Purpose

Running benchmarks in GitHub Actions, or on a host where multiple users run various workloads, can skew time-based benchmark metrics and lead to noisy CI failures. Also, this will give benchdiff another tool in its kit.

What This Does

This flag allows benchdiff to bypass benchstat's statistical significance checks for allocs/op and B/op, ensuring that any regression in these metrics exceeding the tolerance threshold triggers a failure.

By default, any regression in these metrics signals a failure, but a tolerance can be configured.
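As a rough illustration of the behavior described above, here is a minimal sketch of a strict check for deterministic metrics. It is not the code from this PR: the names `deterministicUnits`, `percentChange`, and `isStrictRegression` are hypothetical, and treating the tolerance as a percentage is an assumption.

```go
package main

import "math"

// deterministicUnits lists the metrics the PR description treats as
// deterministic. The variable name is hypothetical.
var deterministicUnits = map[string]bool{
	"allocs/op": true,
	"B/op":      true,
}

// percentChange returns the change from base to head as a percentage of base.
// A growth from zero is reported as +Inf so that it always exceeds any
// finite tolerance.
func percentChange(base, head float64) float64 {
	if base == 0 {
		if head == 0 {
			return 0
		}
		return math.Inf(1)
	}
	return (head - base) / base * 100
}

// isStrictRegression reports whether a deterministic metric got worse by more
// than tolerancePct, with no statistical significance test involved.
// Assumption: the tolerance is expressed in percent.
func isStrictRegression(unit string, base, head, tolerancePct float64) bool {
	if !deterministicUnits[unit] {
		// Time-based metrics such as ns/op are left to the usual
		// significance-based comparison.
		return false
	}
	return percentChange(base, head) > tolerancePct
}
```

With a tolerance of 0, going from 10 to 11 allocs/op counts as a regression, while a change in ns/op is ignored by this check regardless of its size.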

ev1lm0nk3y added 2 commits June 4, 2026 01:12

feat: add --deterministic-tolerance flag

43d6238

This allows users to define a separate threshold for regressions in allocs/op and B/op. It defaults to 0.0 as requested.
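To make the default concrete, the following sketch parses a `--deterministic-tolerance` option that falls back to 0.0 when it is not given. It uses Go's standard `flag` package purely for illustration; how benchdiff actually defines its flags is not shown in this PR, and the helper name `parseTolerance` and the help text are hypothetical.

```go
package main

import "flag"

// parseTolerance reads the deterministic tolerance from args and returns 0.0
// when the flag is absent. The function name and help text are hypothetical;
// only the flag name and its default come from the commit message.
func parseTolerance(args []string) (float64, error) {
	fs := flag.NewFlagSet("benchdiff-sketch", flag.ContinueOnError)
	tol := fs.Float64("deterministic-tolerance", 0.0,
		"allowed regression in allocs/op and B/op before failing")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *tol, nil
}
```

Calling `parseTolerance(nil)` yields the default of 0.0, and `parseTolerance([]string{"--deterministic-tolerance=2.5"})` yields 2.5.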

ev1lm0nk3y wants to merge 2 commits into WillAbides:main from ev1lm0nk3y:strict-deterministic-metrics
