[SPARK-57716][SQL][TESTS] Add SORT / ORDER BY and window correctness tests for nanosecond-precision timestamp types by stevomitric · Pull Request #56811 · apache/spark

stevomitric · 2026-06-26T12:56:27Z

What changes were proposed in this pull request?

Tests-only coverage for SORT / ORDER BY and window functions over the nanosecond-precision
timestamp types TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) (p in [7, 9]):

TimestampNanosWindowSuiteBase (+ ANSI on/off): row_number / rank / dense_rank / lag / lead over a nanosecond ordering key, NTZ and LTZ, whole-stage codegen on and off. No existing test asserts a window function over a timestamp ordering key.
TimestampNanosSortSuiteBase (+ ANSI on/off): the DataFrame/SQL scenarios not already covered generically by OrderingSuite / SortSuite (which exercise these types via DataTypeTestUtils.atomicTypes) — a public-API ORDER BY sub-microsecond tie-break smoke test, mixed-precision UNION ordering, a vectorized-ORC-read-then-sort, and an intercept documenting that caching a nanosecond column is not supported yet.
Golden coverage: a short ORDER BY + row_number + lead section appended to timestamp-ntz-nanos.sql / timestamp-ltz-nanos.sql.

Why are the changes needed?

SORT / ORDER BY and window functions already work over the nanosecond types (they ride on the orderability / hashing / UnsafeRow primitives), but had no dedicated test coverage. This locks the behaviour in, mirroring the MIN/MAX follow-up which was likewise tests-only. Joins are deferred to a later change.

Does this PR introduce any user-facing change?

No, tests only.

How was this patch tested?

New suites.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

…sion timestamp types End-to-end SORT / ORDER BY and window-function coverage over the nanosecond timestamp types TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) (p in 7..9). These already work (they ride on the orderability / hashing / UnsafeRow primitives); this locks the behaviour in, mirroring the MIN/MAX follow-up which was likewise tests-only. Joins are deferred to a later change. - TimestampNanosWindowSuiteBase (+ ANSI on/off): row_number / rank / dense_rank / lag / lead over a nanosecond ordering key, NTZ and LTZ, codegen on and off. No existing test asserts a window function over a timestamp ordering key. - TimestampNanosSortSuiteBase (+ ANSI on/off): the DataFrame / SQL scenarios not already covered generically by OrderingSuite / SortSuite (which exercise the nanos types via DataTypeTestUtils.atomicTypes) -- a public-API ORDER BY tie-break smoke test, mixed-precision UNION ordering, a vectorized-ORC-read- then-sort, and an intercept documenting that caching a nanosecond column is not supported yet. - Golden coverage: a short ORDER BY + row_number + lead section appended to timestamp-ntz-nanos.sql / timestamp-ltz-nanos.sql. Tests only; no production change. Co-authored-by: Isaac

stevomitric · 2026-06-26T13:13:10Z

cc @MaxGekk PTAL.

MaxGekk

0 blocking, 2 non-blocking, 0 nits.
Clean, well-documented tests-only PR. The suites are built so they genuinely observe nanosecond ordering (tie-break fixtures share epochMicros and differ only in nanosWithinMicro, so the distinct row-number / rank / strict-ordered collect is the real proof), assertions are correct, placement matches the existing TimestampNanos*SuiteBase family, and the cache intercept correctly documents the current gap. I independently confirmed the six infrastructure claims the comments rely on (testing-default flag, atomicTypes membership, ORC vectorized nanos getters, the cache-write throw path, findWiderDateTimeType widening, and the golden-file session-zone / bare-LTZ-literal round-trip) against the current source. Two minor non-blocking notes only.

Design / architecture (1)

PR description: the body says this "mirrors the MIN/MAX follow-up which was likewise tests-only," but there is no MinMax suite in the tree — MIN/MAX coverage appears to live inside TimestampNanosFunctionsSuiteBase. Minor description imprecision; not load-bearing.

Suggestions (1)

TimestampNanosWindowSuiteBase.scala:101: LTZ row_number test covers only ASC while its NTZ sibling covers ASC+DESC — see inline.

MaxGekk · 2026-06-26T16:17:10Z

+    }
+  }
+
+  test("row_number over a nanosecond TIMESTAMP_LTZ orders by the sub-microsecond part") {


Minor (non-blocking): this LTZ row_number test asserts only the ASC ordering, whereas its NTZ sibling (row_number over a nanosecond TIMESTAMP_NTZ ...) asserts both ASC and DESC. The comparator is type-shared so the risk is low, but adding the DESC arm here would make the NTZ/LTZ pair symmetric and guard the LTZ descending path explicitly.

stevomitric changed the title ~~[SQL][TESTS][WIP] Add SORT / ORDER BY and window correctness tests for nanosecond-precision timestamp types~~ [SPARK-57716][SQL][TESTS] Add SORT / ORDER BY and window correctness tests for nanosecond-precision timestamp types Jun 26, 2026

MaxGekk reviewed Jun 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SPARK-57716][SQL][TESTS] Add SORT / ORDER BY and window correctness tests for nanosecond-precision timestamp types#56811

[SPARK-57716][SQL][TESTS] Add SORT / ORDER BY and window correctness tests for nanosecond-precision timestamp types#56811
stevomitric wants to merge 1 commit into
apache:masterfrom
stevomitric:stevomitric/timestamp-nanos-sort-tests

stevomitric commented Jun 26, 2026

Uh oh!

stevomitric commented Jun 26, 2026

Uh oh!

MaxGekk left a comment

Uh oh!

MaxGekk Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

stevomitric commented Jun 26, 2026

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

stevomitric commented Jun 26, 2026

Uh oh!

MaxGekk left a comment

Choose a reason for hiding this comment

Design / architecture (1)

Suggestions (1)

Uh oh!

MaxGekk Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants