Skip to content

Keep unions of general array types separate#5506

Draft
ondrejmirtes wants to merge 9 commits into2.1.xfrom
unions-of-arrays
Draft

Keep unions of general array types separate#5506
ondrejmirtes wants to merge 9 commits into2.1.xfrom
unions-of-arrays

Conversation

@ondrejmirtes
Copy link
Copy Markdown
Member

@ondrejmirtes ondrejmirtes commented Apr 21, 2026

Summary

  • Under bleeding edge, TypeCombinator::processArrayTypes no longer collapses multiple general array members into a single ArrayType with unioned keys and values, so array<int>|array<string> stays distinct from array<int|string> (fixes Union of arrays is not array of unions phpstan#8963). Subsumption (array<int>|array<int|string>array<int|string>) is handled by the existing outer union() loop via compareTypesInUnion.
  • The array{} | non-empty-array<X>array<X> simplification is preserved by falling back to the old collapse path whenever an empty constant array is present — that merge is lossless and not expressible as a keep-separate result.
  • Narrowing works out of the box: is_string($list[0]) on list<int>|list<string> narrows to list<string> (with hasOffsetValue(0, string)).
  • 34 existing fixtures updated to reflect the more-precise output. 10 known regressions remain in cases like array<mixed~'foo'>|non-empty-array<mixed> where the old collapse was lossless; those need a follow-up heuristic before removing the WIP gate.

TMP WIP commit (8c3554c54c) runs the new path for everyone, not just bleeding edge, so CI exercises it. Revert before merge.

Test plan

  • tests/PHPStan/Analyser/nsrt/array-union-keep-separate.php added — covers the issue repro, subsumption cases, narrowing via offset access, constant + general mixing, iteration
  • NodeScopeResolverTest green except 10 known Bucket B files
  • LegacyNodeScopeResolverTest, core Type unit tests green
  • TypeCombinatorTest has 18 describe-output failures under the WIP (no bleedingEdge); expected, to be re-baselined
  • Address Bucket B collapse (array<X>|non-empty-array<Y>) before removing WIP gate

Closes phpstan/phpstan#8648
Closes phpstan/phpstan#12195
Closes phpstan/phpstan#13394
Closes phpstan/phpstan#8963
Closes phpstan/phpstan#7759
Closes phpstan/phpstan#11176
Closes phpstan/phpstan#13789
Closes phpstan/phpstan#12434

Previously `TypeCombinator::processArrayTypes` collapsed multiple general
array members into a single `ArrayType` with unioned keys and values, so
`array<int>|array<string>` was indistinguishable from `array<int|string>`
(phpstan/phpstan#8963). Under `BleedingEdgeToggle::isBleedingEdge()` the
members are now kept as distinct union branches; the outer `union()` loop
handles subsumption via `compareTypesInUnion`. The `array{} | non-empty-
array<X>` -> `array<X>` simplification is preserved by falling back to the
old collapse path whenever an empty constant array is present.
Drops the BleedingEdgeToggle gate so CI exercises the new path on all
configs. Revert before merging.
@ondrejmirtes ondrejmirtes marked this pull request as draft April 21, 2026 19:57
Keep-separate previously returned $arrayTypes verbatim, which let
analysis-emergent unions accumulate dozens of near-identical members
(7 byte-identical entries in bug-7903; 35 distinct
non-empty-array<…, array{Gemarkung: 'X'}> shapes in bug-10538 differing
only in one inner constant; 4-and-growing same-key shapes with stacking
hasOffsetValue accessories in the optional-properties repro). Each
later union() pays O(N²) over them and the describes balloon.

Mirrors optimizeConstantArrays(): try precise, degrade past a budget.

  1. Dedupe by describe(VerbosityLevel::cache()) — cheap, unconditional.
  2. Cap at KEEP_SEPARATE_ARRAYS_LIMIT (4) distinct shapes after dedup;
     larger sets are almost always intermediate analysis state and fall
     through to the old collapse.

Bench (vs base 2.1.x):
  bug-7903   5.9s → 5.1s  (-14%)
  bug-8503   5.0s → 4.2s  (-16%)
  bug-10538  5.8s → 4.0s  (-31%)
  repro      6.9s → 6.1s  (-12%)

Re-baselines narrow-tagged-union.php whose 3-member union now stays
distinct under the new policy. Adds the runaway repro as a bench file.
Previous dedupe used the full describe (including accessories), so
members differing only in stacked hasOffsetValue accessories never
collapsed (cf. the optional-properties repro: 4 same-shape members
each with a different combination of hasOffsetValue('has_wiki') /
hasOffsetValue('has_pages')).

Now: processArrayAccessoryTypes returns a tuple [stripped arrays,
common accessories]. Dedupe key is the stripped describe; the value
kept is the *original* member (with its full accessories), so
per-member non-empty / list / etc. survive on whichever original
wins the collision. Members that share an underlying shape but only
differ in narrowing accessories collapse to one — making the
optional-properties union shrink immediately instead of growing.

Re-baselines 20 fixtures: most now retain non-empty / list info that
the previous full-describe dedupe lost (precision improvement).
Locks in the behaviour fixed by the keep-separate change so the bugs
cannot silently resurface. Each test fails on 2.1.x and passes on the
current branch.

Type-inference (NSRT):
  bug-12195 — list<string>|array{0: null} stays distinct
  bug-11176 — int|int[]|array{module: string} stays as 3 types

Rule errors removed (false positives):
  bug-8648  — OffsetAccessAssignmentRule no longer fires on
              $this->data['foo']['bar'][] inside a foreach
  bug-7759  — CallToFunctionParameters no longer reports
              array<string, string>|string passed where string expected
  bug-13394 — ReturnTypeRule no longer reports a generic mismatch
              after a side-effect-free is_array/isset narrowing

Rule errors added (true positives now reported):
  bug-8963  — CallToFunctionParameters now reports array<int|string>
              given where array<string>|array<int> expected

Closes phpstan/phpstan#8648
Closes phpstan/phpstan#12195
Closes phpstan/phpstan#13394
Closes phpstan/phpstan#8963
Closes phpstan/phpstan#7759
Closes phpstan/phpstan#11176
list<non-empty-array<mixed>> was widened to list<array<mixed>> after a
foreach-by-ref loop in which $row was first sanitized then re-populated.
Locks in the expected post-loop type.

Closes phpstan/phpstan#13789
Union of two non-empty-list wrappers around distinct array shapes:
non-empty-list<array{name: BackedEnum}>|non-empty-list<array{name: string}>.
With @phpstan-assert-if-true narrowing to one variant, the else branch
previously stayed as the full union; now it correctly narrows to the
remaining variant.

Closes phpstan/phpstan#12434
… cases

bug-7903 / bug-8503 / bug-11283 each spend a noticeable fraction of
their analysis inside the keep-separate path returning 2-member unions.
Each subsequent union() over those 2-member results pays O(N²) in
compareTypesInUnion that the old single-array collapse skipped.

Lowering the cap from 4 to 2 means anything bigger falls through to the
old collapse, while 2-member intentional unions
(array<int>|array<string>, list<int>|list<string>, the regression cases
in #13789 / #12434 / #8963 / etc.) still survive distinct.

Bench (median of 3, vs 2.1.x base):

  bug-7903        5.0s  vs  5.9s   (-15%)
  bug-8503        4.1s  vs  5.9s   (-30%)
  bug-11283       6.5s  vs  8.0s   (-19%)
  optional-props  5.0s  vs  6.9s   (-28%)

NodeScopeResolverTest: 10 known Bucket B failures remain (unchanged).
All regression tests continue to pass.

Re-baselines 4 fixtures whose 3-member unions now collapse under the
lower cap.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant