feat(sql): aggregates / GROUP BY / DISTINCT / HAVING over JOIN results (SQLR-6)#164

Merged
joaoh82 merged 1 commit into main from sqlr-6-join-aggregates
Jun 10, 2026

@joaoh82 joaoh82 commented Jun 10, 2026

Summary

Closes the SQLR-6 limitation left by #99: aggregates, GROUP BY, DISTINCT — and HAVING — now compose over JOIN results. SELECT customers.name, COUNT(*) FROM customers JOIN orders ON ... GROUP BY customers.name works across all four join flavors.

How

  • Scope-generic aggregator. aggregate_rows now consumes an iterator of RowScopes instead of (&Table, rowid), so the joined row stream feeds the same accumulator as the single-table path. Shared helpers (lower_having_into_hidden_slots, run_aggregation_pipeline) carry HAVING / DISTINCT / output-row ORDER BY / LIMIT identically on both paths.
  • Qualified GROUP BY keys. New GroupByKey { qualifier, name }: GROUP BY customers.name disambiguates same-named columns; AggregateArg::Column keeps its qualifier too (SUM(orders.amount)).
  • Validation moved to where the schemas are. The bare-column-must-be-in-GROUP-BY check stays in the parser for single-table queries and runs in the executor for joined ones (resolve_scope_column), erroring on ambiguous / unknown references.
  • NULL semantics. NULL-padded outer-join rows group under a NULL key; COUNT(col) skips their NULLs while COUNT(*) counts them; SUM over an all-NULL group yields NULL.
  • DISTINCT over joins dedupes the projected output rows, with LIMIT deferred past the dedupe (mirrors the single-table path).
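
The scope-generic accumulator and the NULL rules above can be sketched roughly as follows. This is a minimal illustration, not the engine's code: the `RowScope` trait shape, the `lookup` signature, `JoinedRow`, `Acc`, and `sample_full_join` are all assumptions made for the example, and values are reduced to `Option<i64>` with `None` standing in for SQL NULL.

```rust
use std::collections::BTreeMap;

/// Hypothetical value type for this sketch; `None` models SQL NULL.
type Value = Option<i64>;

/// Assumed stand-in for the PR's `RowScope`: anything that can resolve an
/// optionally qualified column name to a value.
trait RowScope {
    fn lookup(&self, qualifier: Option<&str>, name: &str) -> Value;
}

/// A joined row as (table, column, value) triples. A NULL-padded outer-join
/// side simply carries `None` values.
struct JoinedRow(Vec<(&'static str, &'static str, Value)>);

impl RowScope for JoinedRow {
    fn lookup(&self, qualifier: Option<&str>, name: &str) -> Value {
        // Simplification: a bare name takes the first match; ambiguity
        // checking is left out of this sketch.
        self.0
            .iter()
            .find(|(t, c, _)| *c == name && qualifier.map_or(true, |q| q == *t))
            .and_then(|(_, _, v)| *v)
    }
}

/// Per-group accumulator for COUNT(*), COUNT(arg) and SUM(arg).
#[derive(Debug, Default, PartialEq)]
struct Acc {
    count_star: u64,
    count_col: u64,
    sum: Option<i64>,
}

/// Groups any stream of scopes by `key` and accumulates over `arg`.
fn aggregate_rows<S: RowScope>(
    rows: impl IntoIterator<Item = S>,
    key: (Option<&str>, &str),
    arg: (Option<&str>, &str),
) -> BTreeMap<Value, Acc> {
    let mut groups: BTreeMap<Value, Acc> = BTreeMap::new();
    for row in rows {
        // A NULL key is an ordinary map key, so NULL-padded rows share a group.
        let acc = groups.entry(row.lookup(key.0, key.1)).or_default();
        acc.count_star += 1; // COUNT(*) counts every row
        if let Some(v) = row.lookup(arg.0, arg.1) {
            acc.count_col += 1; // COUNT(col) skips NULLs
            acc.sum = Some(acc.sum.unwrap_or(0) + v); // SUM stays NULL until a non-NULL arrives
        }
    }
    groups
}

/// Sample FULL OUTER JOIN output of customers and orders.
fn sample_full_join() -> Vec<JoinedRow> {
    vec![
        JoinedRow(vec![("customers", "id", Some(1)), ("orders", "amount", Some(10))]),
        JoinedRow(vec![("customers", "id", Some(1)), ("orders", "amount", Some(20))]),
        // Customer 2 has no orders: the orders side is NULL-padded.
        JoinedRow(vec![("customers", "id", Some(2)), ("orders", "amount", None)]),
        // An order with no matching customer: the customers side is NULL-padded.
        JoinedRow(vec![("customers", "id", None), ("orders", "amount", Some(5))]),
    ]
}
```

Calling `aggregate_rows(sample_full_join(), (Some("customers"), "id"), (Some("orders"), "amount"))` gives customer 2 a group with `count_star` of 1, `count_col` of 0, and a `None` sum, while the unmatched order lands in the `None`-keyed group.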

Bonus fixes

  • SELECT * FROM t GROUP BY c used to panic (expect("validated to be in GROUP BY") — parser validation skips Projection::All); now a clean "must appear in GROUP BY" error.
  • Fixed the stale "HAVING is not yet supported" error message (HAVING shipped in SQLR-52).
  • Stale doc claims removed (README still listed HAVING as unsupported; docs/usage.md still listed joins / GROUP BY / ALTER TABLE as unsupported).
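
The shape of the SELECT * fix can be illustrated with a small sketch: look up each projected column among the grouping keys and return an error rather than calling `expect`. The `SqlError` type, both function names, and the exact message wording are hypothetical and chosen only for this example.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum SqlError {
    NotInGroupBy(String),
}

/// Finds the GROUP BY slot for a projected column, returning an error
/// instead of panicking when the column is not a grouping key.
fn group_slot(group_by: &[&str], column: &str) -> Result<usize, SqlError> {
    group_by
        .iter()
        .position(|k| *k == column)
        .ok_or_else(|| {
            SqlError::NotInGroupBy(format!("column '{column}' must appear in GROUP BY"))
        })
}

/// Expands `SELECT *` to every table column and checks each one against the
/// GROUP BY keys; the first offending column short-circuits with an error.
fn plan_star(table_columns: &[&str], group_by: &[&str]) -> Result<Vec<usize>, SqlError> {
    table_columns
        .iter()
        .map(|c| group_slot(group_by, c))
        .collect()
}
```

With a table whose columns are `c` and `d`, `plan_star(&["c", "d"], &["c"])` returns `Err(SqlError::NotInGroupBy(..))` for `d` instead of aborting the process.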

Docs

Updated README, docs/supported-sql.md, docs/sql-engine.md, docs/architecture.md, docs/roadmap.md, docs/usage.md, plus the site's SQL reference (web/src/components/sql-ref.tsx, web/src/app/docs/page.tsx).

Test plan

  • 15 new/updated executor tests: joined GROUP BY + aggregates, aggregates without GROUP BY, outer-join NULL grouping + COUNT(col) NULL-skipping, COUNT(DISTINCT) over join, qualified keys, ambiguous-key / bare-column / aggregate-in-WHERE error paths (the code-review gap from #99), HAVING over all four flavors, ORDER BY on aggregates, DISTINCT + deferred LIMIT, and the SELECT * panic regression.
  • Full workspace suite green (637 engine tests + all crates), cargo fmt --check, cargo doc, clippy diffed vs main (no new warnings in touched files).
  • File-backed REPL smoke: seed → close → reopen → joined aggregate / DISTINCT queries return correct results.
  • npm run build in web/ passes.
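
Why the "DISTINCT + deferred LIMIT" case needs its own test is easiest to see in a sketch. The function below is an assumed illustration, not the executor's code: it dedupes the projected rows first and only then applies LIMIT. The `Row` alias and the function name are hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical projected output row; `None` models SQL NULL.
type Row = Vec<Option<i64>>;

/// Dedupes projected rows in first-seen order, then applies LIMIT to the
/// deduped stream. Applying LIMIT before the dedupe could return fewer
/// distinct rows than requested.
fn distinct_then_limit(rows: Vec<Row>, limit: Option<usize>) -> Vec<Row> {
    let mut seen: HashSet<Row> = HashSet::new();
    // `insert` returns false for a row that was already seen, so the filter
    // keeps only the first occurrence of each row.
    let deduped = rows.into_iter().filter(|row| seen.insert(row.clone()));
    match limit {
        Some(n) => deduped.take(n).collect(),
        None => deduped.collect(),
    }
}
```

For input rows `[1], [1], [2], [3]` with `LIMIT 2`, this returns `[1], [2]`; truncating to two rows before deduping would have left only `[1]`.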

🤖 Generated with Claude Code

…s (SQLR-6)

Generalize the SQLR-3 aggregation pipeline from (table, rowid) to the
RowScope trait, so the joined row stream feeds the same accumulator the
single-table path uses:

- aggregate_rows is now generic over an iterator of RowScopes; group
  keys and aggregate args resolve through scope.lookup, so NULL-padded
  outer-join rows group under NULL and COUNT(col) skips their NULLs.
- GROUP BY keys carry an optional t. qualifier (GROUP BY customers.name)
  via the new GroupByKey struct; AggregateArg::Column keeps its
  qualifier too (SUM(orders.amount)).
- The bare-column-must-be-in-GROUP-BY check stays in the parser for
  single-table queries and moves to the executor for joined ones, where
  qualifier resolution needs the schemas (resolve_scope_column).
- SELECT DISTINCT over a join dedupes the projected output rows, with
  LIMIT deferred past the dedupe (mirrors the single-table path).
- HAVING composes over joins through the shared
  lower_having_into_hidden_slots + run_aggregation_pipeline helpers.
- Bonus fix: SELECT * FROM t GROUP BY c used to panic on the
  'validated to be in GROUP BY' expect (parser validation skips
  Projection::All); it now surfaces the standard 'must appear in
  GROUP BY' error.
- Stale 'HAVING is not yet supported' error message and docs updated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
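
The qualifier resolution that the commit message moves into the executor might look roughly like the sketch below. Only the name resolve_scope_column comes from the commit; its signature here, the `TableSchema` shape, the `ResolveError` variants, and `sample_scope` are assumptions made for illustration.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ResolveError {
    Unknown(String),
    Ambiguous(String),
}

/// A table in the join scope with its column names (assumed shape).
struct TableSchema {
    name: &'static str,
    columns: &'static [&'static str],
}

/// Resolves `qualifier.name` (or a bare `name`) to (table index, column index).
/// A bare name that matches columns in more than one table is ambiguous.
fn resolve_scope_column(
    tables: &[TableSchema],
    qualifier: Option<&str>,
    name: &str,
) -> Result<(usize, usize), ResolveError> {
    let mut hits = Vec::new();
    for (ti, table) in tables.iter().enumerate() {
        // With a qualifier, only the named table is considered.
        if qualifier.map_or(false, |q| q != table.name) {
            continue;
        }
        if let Some(ci) = table.columns.iter().position(|c| *c == name) {
            hits.push((ti, ci));
        }
    }
    match hits.as_slice() {
        [] => Err(ResolveError::Unknown(name.to_string())),
        [one] => Ok(*one),
        _ => Err(ResolveError::Ambiguous(name.to_string())),
    }
}

/// Sample scope for `customers JOIN orders`.
fn sample_scope() -> Vec<TableSchema> {
    vec![
        TableSchema { name: "customers", columns: &["id", "name"] },
        TableSchema { name: "orders", columns: &["id", "customer_id", "amount"] },
    ]
}
```

In this scope a bare `name` resolves to customers, a bare `id` is rejected as ambiguous, and `orders.id` resolves cleanly, which is the disambiguation that qualified GROUP BY keys rely on.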

vercel Bot commented Jun 10, 2026

Project: rust-sqlite · Deployment: Ready · Actions: Preview, Comment · Updated (UTC): Jun 10, 2026 6:36am

@joaoh82 joaoh82 merged commit 79eff1d into main Jun 10, 2026
21 checks passed