feat(sql): aggregates / GROUP BY / DISTINCT / HAVING over JOIN results (SQLR-6) #164
Merged
…s (SQLR-6)

Generalize the SQLR-3 aggregation pipeline from (table, rowid) to the RowScope trait, so the joined row stream feeds the same accumulator the single-table path uses:

- aggregate_rows is now generic over an iterator of RowScopes; group keys and aggregate args resolve through scope.lookup, so NULL-padded outer-join rows group under NULL and COUNT(col) skips their NULLs.
- GROUP BY keys carry an optional t. qualifier (GROUP BY customers.name) via the new GroupByKey struct; AggregateArg::Column keeps its qualifier too (SUM(orders.amount)).
- The bare-column-must-be-in-GROUP-BY check stays in the parser for single-table queries and moves to the executor for joined ones, where qualifier resolution needs the schemas (resolve_scope_column).
- SELECT DISTINCT over a join dedupes the projected output rows, with LIMIT deferred past the dedupe (mirrors the single-table path).
- HAVING composes over joins through the shared lower_having_into_hidden_slots + run_aggregation_pipeline helpers.
- Bonus fix: SELECT * FROM t GROUP BY c used to panic on the 'validated to be in GROUP BY' expect (parser validation skips Projection::All); it now surfaces the standard 'must appear in GROUP BY' error.
- Stale 'HAVING is not yet supported' error message and docs updated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
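To make the NULL semantics described in the commit message concrete, here is a minimal sketch of aggregating over a scope abstraction rather than a `(table, rowid)` pair. Everything in it is a hypothetical stand-in and not the engine's real code: `Value`, `RowScopeLike`, `JoinedRow`, `Acc`, and `aggregate_scopes` are illustrative names, and the real `RowScope::lookup` signature may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the engine's value type.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Value {
    Null,
    Int(i64),
    Text(String),
}

/// Hypothetical, simplified scope abstraction: anything that can resolve an
/// optionally qualified column name to a value.
trait RowScopeLike {
    fn lookup(&self, qualifier: Option<&str>, name: &str) -> Value;
}

/// A joined row as (table, column, value) triples. Outer-join padding is
/// modelled by storing `Value::Null` for the missing side's columns.
struct JoinedRow(Vec<(&'static str, &'static str, Value)>);

impl RowScopeLike for JoinedRow {
    fn lookup(&self, qualifier: Option<&str>, name: &str) -> Value {
        self.0
            .iter()
            .find(|(t, c, _)| *c == name && qualifier.map_or(true, |q| q == *t))
            .map(|(_, _, v)| v.clone())
            // This sketch maps unknown columns to NULL; it does no error handling.
            .unwrap_or(Value::Null)
    }
}

/// Per-group accumulators for COUNT(*), COUNT(col) and SUM(col).
#[derive(Debug, Default, PartialEq)]
struct Acc {
    count_star: i64,
    count_col: i64,
    /// Stays `None` (SQL NULL) until a non-NULL input is seen.
    sum: Option<i64>,
}

/// Group any stream of scopes by one key column and aggregate one argument
/// column. The same function serves single-table and joined rows alike.
fn aggregate_scopes<S: RowScopeLike>(
    rows: impl IntoIterator<Item = S>,
    key: (Option<&str>, &str),
    arg: (Option<&str>, &str),
) -> BTreeMap<Value, Acc> {
    let mut groups: BTreeMap<Value, Acc> = BTreeMap::new();
    for row in rows {
        // A NULL key is an ordinary map key, so padded rows group under NULL.
        let acc = groups.entry(row.lookup(key.0, key.1)).or_default();
        acc.count_star += 1; // COUNT(*) counts every row, padded or not
        if let Value::Int(n) = row.lookup(arg.0, arg.1) {
            acc.count_col += 1; // COUNT(col) and SUM(col) skip NULL inputs
            acc.sum = Some(acc.sum.unwrap_or(0) + n);
        }
    }
    groups
}
```

Under these assumptions, a LEFT JOIN row for a customer with no orders contributes 1 to `count_star`, 0 to `count_col`, and leaves `sum` as `None`, which matches the behaviour the commit message describes.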
Summary
Closes the SQLR-6 limitation left by #99: aggregates, `GROUP BY`, `DISTINCT`, and `HAVING` now compose over JOIN results. `SELECT customers.name, COUNT(*) FROM customers JOIN orders ON ... GROUP BY customers.name` works across all four join flavors.

How
- `aggregate_rows` now consumes an iterator of `RowScope`s instead of `(&Table, rowid)`, so the joined row stream feeds the same accumulator as the single-table path. Shared helpers (`lower_having_into_hidden_slots`, `run_aggregation_pipeline`) carry HAVING / DISTINCT / output-row ORDER BY / LIMIT identically on both paths.
- `GroupByKey { qualifier, name }`: `GROUP BY customers.name` disambiguates same-named columns; `AggregateArg::Column` keeps its qualifier too (`SUM(orders.amount)`).
- For joined queries, the bare-column-must-be-in-GROUP-BY check runs in the executor, where qualifier resolution needs the schemas (`resolve_scope_column`), erroring on ambiguous / unknown references.
- NULL-padded outer-join rows group under NULL; `COUNT(col)` skips their NULLs while `COUNT(*)` counts them; `SUM` over an all-NULL group yields NULL.

Bonus fixes
- `SELECT * FROM t GROUP BY c` used to panic (`expect("validated to be in GROUP BY")`; parser validation skips `Projection::All`); now a clean "must appear in GROUP BY" error.
- Stale error message and docs updated (the error message still reported `HAVING` as unsupported; `docs/usage.md` still listed joins / GROUP BY / ALTER TABLE as unsupported).

Docs
README, `docs/supported-sql.md`, `docs/sql-engine.md`, `docs/architecture.md`, `docs/roadmap.md`, `docs/usage.md`, plus the site's SQL reference (`web/src/components/sql-ref.tsx`, `web/src/app/docs/page.tsx`).

Test plan
- Tests cover `COUNT(col)` NULL-skipping, `COUNT(DISTINCT)` over join, qualified keys, ambiguous-key / bare-column / aggregate-in-WHERE error paths (the code-review gap from #99, the SQLR-5 JOINs PR for INNER / LEFT / RIGHT / FULL OUTER), HAVING over all four flavors, ORDER BY on aggregates, DISTINCT + deferred LIMIT, and the `SELECT *` panic regression.
- `cargo fmt --check`, `cargo doc`, clippy diffed vs `main` (no new warnings in touched files).
- `npm run build` in `web/` passes.

🤖 Generated with Claude Code
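The "DISTINCT + deferred LIMIT" item in the test plan is easiest to see with a small example. The sketch below is not the engine's implementation: `Row`, `distinct_then_limit`, and `limit_then_distinct` are hypothetical names, and rows are reduced to vectors of optional integers. It only illustrates why LIMIT has to run after the dedupe.

```rust
use std::collections::HashSet;

/// A projected output row; `None` models SQL NULL. (Hypothetical simplification.)
type Row = Vec<Option<i64>>;

/// DISTINCT first, LIMIT second: dedupe in first-seen order, then truncate.
fn distinct_then_limit(rows: Vec<Row>, limit: Option<usize>) -> Vec<Row> {
    let mut seen: HashSet<Row> = HashSet::new();
    // `insert` returns false for a row already seen, so duplicates are dropped.
    let deduped = rows.into_iter().filter(|row| seen.insert(row.clone()));
    match limit {
        Some(n) => deduped.take(n).collect(),
        None => deduped.collect(),
    }
}

/// The order to avoid: truncating before the dedupe can return too few rows.
fn limit_then_distinct(rows: Vec<Row>, limit: usize) -> Vec<Row> {
    let mut seen: HashSet<Row> = HashSet::new();
    rows.into_iter()
        .take(limit)
        .filter(|row| seen.insert(row.clone()))
        .collect()
}
```

With input rows `[1], [1], [2]` and `LIMIT 2`, the first function returns two rows (`[1]` and `[2]`), while the second returns only `[1]` because both rows it kept before deduping were duplicates.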