Skip to content

fix-audit-405-r2: mirror PR #405 CV notation + bacon Parameters fix into sibling surfaces#444

Open
igerber wants to merge 1 commit into
mainfrom
fix-audit-405-r2
Open

fix-audit-405-r2: mirror PR #405 CV notation + bacon Parameters fix into sibling surfaces#444
igerber wants to merge 1 commit into
mainfrom
fix-audit-405-r2

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented May 15, 2026

Summary

Holistic-audit fix-PR for #405 (docs: fix 86 Sphinx ERRORs across docstring substitutions and indentation). Pilot at /tmp/pilot-405-wt cherry-picked #405 onto its pre-merge base and ran iterative codex review (R0 + R1 with gpt-5.4 + xhigh reasoning).

R0 surfaced one actionable P3 (cross-surface notation consistency): #405 changed coef_var docstrings from SE / |ATT| to SE / abs(ATT) (the bar form triggered Sphinx substitution-reference interpretation), but did not mirror the same change into the user-facing summary labels rendered by summary() / print(). R1 confirms clean ✅.

This PR mirrors PR #405's notation update into:

  • 14 summary call sites across 11 result modules — CV (SE/|ATT|):CV (SE/abs(ATT)): (and the dynamic chaisemartin label f"CV (SE/|{label}|):"f"CV (SE/abs({label})):")
  • BaconDecomposition.__init__ Parameters block at bacon.py:404 — blank line before bullet list (same fix docs: fix 86 Sphinx ERRORs (PR 1 of 2-PR cleanup) #405 already applied to bacon_decompose())
  • One cached T18 tutorial output cell that quoted the old label (no notebook re-execution needed since nbsphinx_execute = "never")

No methodology impact — pure rendered-surface consistency fix.

Test plan

  • CI green

…ram block

PR #405 changed coef_var docstrings from "SE / |ATT|" to "SE / abs(ATT)"
to avoid Sphinx interpreting |...| as a substitution reference, but did
not mirror the same change into the user-facing summary labels rendered
by summary() / print(). The result: docstrings now describe the metric
as "SE/abs(ATT)" while the terminal prints "CV (SE/|ATT|):" — two
labels for the same quantity.

Mirror the notation across all 14 summary call sites and update the
single cached T18 tutorial output that quotes the label:

- diff_diff/results.py (3 sites: DiD/MultiPeriod/SyntheticDiD)
- diff_diff/{continuous_did,efficient_did,imputation,stacked_did,
  staggered,staggered_triple_diff,trop,two_stage}_results.py (1 each)
- diff_diff/sun_abraham.py (1)
- diff_diff/chaisemartin_dhaultfoeuille_results.py (dynamic
  f"CV (SE/|{label}|):" -> f"CV (SE/abs({label})):")
- docs/tutorials/18_geo_experiments.ipynb (1 cached output line)

Also mirror PR #405's blank-line-before-bullet-list fix into
BaconDecomposition.__init__ at bacon.py:404, matching the same fix
already applied to bacon_decompose() at bacon.py:1086 and 1097 by
PR #405. This Parameters block is rendered through
docs/api/_autosummary/diff_diff.BaconDecomposition.rst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

Overall Assessment

✅ Looks good: highest unmitigated severity is P3/informational only.

Executive Summary

  • The diff is methodology-neutral: it only updates rendered CV labels in result summary() methods, one cached notebook output line, and a Sphinx docstring blank line. No estimator, weighting, variance, identification, or default behavior changes.
  • Cross-checking the touched result surfaces against the existing coef_var definitions shows the new labels accurately describe the computed quantity as SE / abs(...), e.g. diff_diff/results.py:L160-L252, diff_diff/results.py:L522-L659, diff_diff/results.py:L982-L1055, diff_diff/chaisemartin_dhaultfoeuille_results.py:L759-L897.
  • The change is consistent with the registry's result-surface contract and does not introduce an undocumented deviation from source material: docs/methodology/REGISTRY.md:L5-L5.
  • BaconDecomposition.__init__ is a docstring-formatting fix only; Goodman-Bacon decomposition logic is unchanged: diff_diff/bacon.py:L397-L407, docs/methodology/REGISTRY.md:L2598-L2607.
  • The cached Tutorial 18 SDiD output now matches runtime summary() output, eliminating doc drift: diff_diff/results.py:L990-L1055, docs/tutorials/18_geo_experiments.ipynb:L499-L503.

Methodology

  • Severity: P3. Impact: The affected methods are display surfaces for DiD, MultiPeriodDiD, SyntheticDiD, ContinuousDiD, EfficientDiD, ImputationDiD, StackedDiD, Callaway-Sant'Anna, StaggeredTripleDifference, Sun-Abraham, TROP, TwoStageDiD, and dCDH. Each changed line only updates the printed CV notation to match the already-implemented coef_var = SE / abs(...) definition; I found no change to estimands, weighting, variance/SE, assumption checks, or defaults. Concrete fix: None. References: docs/methodology/REGISTRY.md:L5-L5, diff_diff/results.py:L160-L252, diff_diff/results.py:L522-L659, diff_diff/results.py:L982-L1055, diff_diff/continuous_did_results.py:L196-L279, diff_diff/chaisemartin_dhaultfoeuille_results.py:L759-L897.
  • Severity: P3. Impact: The BaconDecomposition hunk is Sphinx-formatting only and does not alter Goodman-Bacon comparison weights or decomposition behavior. Concrete fix: None. References: diff_diff/bacon.py:L397-L407, docs/methodology/REGISTRY.md:L2598-L2607.

Code Quality

  • No findings. The diff does not add executable branches, inline inference code, new parameters, or new NaN/empty-set/control-group logic.

Performance

  • No findings. Pure string/docstring/notebook-output changes.

Maintainability

  • Severity: P3. Impact: The notation fix was applied across sibling result summaries instead of only one surface, reducing cross-surface drift and user confusion. Concrete fix: None. References: diff_diff/results.py:L250-L252, diff_diff/results.py:L657-L659, diff_diff/results.py:L1053-L1055, diff_diff/chaisemartin_dhaultfoeuille_results.py:L894-L897, diff_diff/continuous_did_results.py:L277-L279, diff_diff/efficient_did_results.py:L261-L263, diff_diff/imputation_results.py:L256-L258, diff_diff/stacked_did_results.py:L213-L215, diff_diff/staggered_results.py:L235-L237, diff_diff/staggered_triple_diff_results.py:L197-L199, diff_diff/sun_abraham.py:L191-L193, diff_diff/trop_results.py:L237-L239, diff_diff/two_stage_results.py:L254-L256.

Tech Debt

  • No findings. I did not see any new limitation that needs TODO.md tracking, and no TODO-tracked mitigation is being relied on here.

Security

  • No findings. No executable workflow, dependency, or secret-handling surface changed; the notebook delta is cached output text only (docs/tutorials/18_geo_experiments.ipynb:L499-L503).

Documentation/Tests

  • Severity: P3. Impact: Tutorial 18's cached SDiD summary output is now consistent with SyntheticDiDResults.summary(), so readers see the same label that runtime users get. Concrete fix: None. References: diff_diff/results.py:L990-L1055, docs/tutorials/18_geo_experiments.ipynb:L499-L503.
  • No findings on tests. Given the diff is non-functional, the absence of new tests does not block approval.

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant