fix(postgres): correctly infer nullability for LEFT JOIN rewritten as RIGHT JOIN#4285
Open
luca992 wants to merge 3 commits into
Open
fix(postgres): correctly infer nullability for LEFT JOIN rewritten as RIGHT JOIN#4285luca992 wants to merge 3 commits into
luca992 wants to merge 3 commits into
Conversation
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PostgreSQL's planner may execute
A LEFT JOIN Bas a hash join withJoin Type: Rightto put the smaller relation on the hash-build side.This is documented behavior:
[Postgres docs, 14.3 Controlling the Planner with Explicit JOIN Clauses]
[Postgres Pro, "Queries in PostgreSQL: 6. Hashing"]
After the swap the SQL right operand (the nullable side under LEFT JOIN semantics) ends up as the plan's
Outerchild instead of theInnerchild. The oldvisit_planonly markedInnerchildren nullable. On aJoin Type: Rightplan that produced two distinct failures:Option<T>in macro output for NOT NULL columnsThe old
visit_planalso only inspected children of outer joins, never the join node's ownOutput. Computed expressions likeb.x || 'y'orCOALESCE(b.x, NULL)materialize at the join node itself; the nullable child only carries raw column refs, so nullability inference silently dropped these columns. Postgres further deparses these expressions with an extra outer paren pair at root (((expr))) compared to the join node ((expr)), so exact-string matching missed them.Fix threads
in_nullablethroughvisit_planand picks the NULL-fill side from each join's type:Left→Innerchild is nullableRight→Outerchild is nullableFull→ both children are nullableIt also walks the join node's own
Outputand marks any entry that mentions a qualified column (alias.col, per PG manual §4.1.1) drawn from the nullable-side subtree's leaves. That catches computed expressions likeb.x || 'y'orCOALESCE(b.x, …)that only materialize at the join level, and leaves sibling SubPlan / InitPlan outputs ((SubPlan N)) alone. Subplans are computed independently of the join's NULL extension, so their nullability is genuinely unknown to this pass. Output comparison normalizes redundant outer parens (Postgres-lexical tokenizer for'…',"…",E'…',B'…'/X'…',U&'…'/U&"…",$tag$…$tag$) so root((expr))matches join(expr).Recursion now descends into all child plans, not only when the current node is
Left/Right, so nested joins reached through non-join intermediates likeHashare walked.Does your PR solve an issue?
fixes #3202
Is this a breaking change?
Yes. The old behavior was wrong inference.
For queries with a LEFT JOIN that postgres rewrites as a Hash Right Join (driven by
pg_statisticcost estimates, fires on production-sized data withplan_cache_mode = force_generic_planwhichsqlx-macros-coreitself sets), generated types change in three places:Columns from the SQL left operand that are NOT NULL in their base table go from
Option<T>toT. Downstream code handling these unneeded nulls would need to be dropped.Columns from the SQL right operand (the LEFT JOIN nullable side) go from
TtoOption<T>. Code treating these as non-null was already at risk ofunexpected null; try decoding as an Optionpanics at runtime whenever a row had no matching join partner. With this fix it stops compiling and the field needs to becomeOption<T>.Computed expressions on the nullable side of any outer join (
COALESCE(b.x, …),b.x || y, function calls,CASE, etc.) go fromTtoOption<T>for the same reason. Same latent panic risk, same migration.The right-operand and computed-expression cases are the bigger ones. Code passed type checks before only because LEFT JOINs happened to always find a match in test data. After this fix the type system exposes the real nullability.