test: add SQL test coverage for spark.sql.legacy.timeParserPolicy #4183
Open
andygrove wants to merge 2 commits into apache:main from
Audit every Spark expression that reads spark.sql.legacy.timeParserPolicy (date_format, from_unixtime, unix_timestamp, to_unix_timestamp, to_timestamp, to_date, and Spark 4's try_to_timestamp) and add CometSqlFileTestSuite coverage. For each expression provide:
- a ConfigMatrix file exercising convergent inputs under LEGACY, CORRECTED, and EXCEPTION
- per-policy files locking in divergent behavior (lenient parsing under LEGACY, null returns under CORRECTED, INCONSISTENT_BEHAVIOR_CROSS_VERSION under EXCEPTION)
Also add docs/source/contributor-guide/spark_configs_support.md, modeled on the expression audit log, to track Spark configs that affect Comet behavior, with full audit notes for the timeParserPolicy entry.
All 42 generated tests pass on Spark 3.4.3, 3.5.8, and 4.0.1.
Which issue does this PR close?
Part of #4180
Rationale for this change
spark.sql.legacy.timeParserPolicy (LEGACY/CORRECTED/EXCEPTION) controls which datetime parser Spark uses and materially changes results on lenient inputs and ambiguous patterns. No existing Comet SQL test exercises this config, so there is no regression net for the seven expressions that read it. This PR closes that gap.
What changes are included in this PR?
For each Spark expression that reads the policy (date_format, from_unixtime, unix_timestamp, to_unix_timestamp, to_timestamp/to_timestamp_ntz, to_date, and Spark 4's try_to_timestamp):
- A ConfigMatrix file exercising convergent inputs under LEGACY, CORRECTED, and EXCEPTION.
- Per-policy files (*_legacy.sql, *_corrected.sql, *_exception.sql) covering divergent inputs: single-digit fields under fixed-width patterns, out-of-range month/day, trailing characters, legacy-only pattern tokens like aaaa, and the INCONSISTENT_BEHAVIOR_CROSS_VERSION exception paths.
A new contributor-guide page, spark_configs_support.md, mirrors the expression audit log: it tracks Spark configs that affect Comet behavior and records the full audit notes for spark.sql.legacy.timeParserPolicy (source semantics, affected expressions, current Comet status, test layout, findings).
This PR was scaffolded with the project's audit-comet-expression workflow extended to a config-level audit, plus the superpowers:brainstorming and superpowers:using-git-worktrees skills.
How are these changes tested?
CometSqlFileTestSuite runs the 42 generated test cases through both Spark and Comet and compares results. Verified locally:
- Spark 3.5.8 (default), 42/42 pass: ./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite time_parser_policy" -Dtest=none
- Spark 3.4 profile, 42/42 pass: ./mvnw test -Pspark-3.4 -Dsuites="org.apache.comet.CometSqlFileTestSuite time_parser_policy" -Dtest=none
- Spark 4.0 profile, 6/6 pass: ./mvnw test -Pspark-4.0 -Dsuites="org.apache.comet.CometSqlFileTestSuite try_to_timestamp_time_parser_policy" -Dtest=none (to_timestamp_time_parser_policy_exception was also verified on 4.0)
No Comet bugs were uncovered by the audit.
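For reviewers unfamiliar with why the per-policy files are needed, here is a minimal standalone sketch of the kind of divergence they lock in. It is not Comet or Spark code: the class ParserDivergence and its helpers lenientParse and strictParse are hypothetical names for illustration. It assumes, following Spark's migration notes, that LEGACY behaves like the lenient java.text parser and that CORRECTED behaves like the stricter java.time parser. Spark layers its own handling on top, so actual query results may differ in detail.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration only; not part of Comet or Spark.
class ParserDivergence {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Lenient java.text parse (assumed stand-in for LEGACY).
    // Returns the parsed date as yyyy-MM-dd, or null if parsing fails.
    static String lenientParse(String input, String pattern) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setTimeZone(UTC);
        parser.setLenient(true); // already the default; set explicitly for clarity
        try {
            // parse(String) may leave trailing text unconsumed without failing.
            Date parsed = parser.parse(input);
            SimpleDateFormat printer = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
            printer.setTimeZone(UTC);
            return printer.format(parsed);
        } catch (ParseException e) {
            return null;
        }
    }

    // Strict java.time parse (assumed stand-in for CORRECTED).
    // Returns the parsed date as yyyy-MM-dd, or null if parsing fails.
    static String strictParse(String input, String pattern) {
        try {
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.US);
            return LocalDate.parse(input, formatter).toString();
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd";
        // One convergent input, then single-digit fields, an out-of-range
        // month, and trailing characters.
        String[] inputs = {"2020-01-05", "2020-1-5", "2020-13-01", "2020-01-05abc"};
        for (String input : inputs) {
            System.out.println(input + " -> lenient=" + lenientParse(input, pattern)
                + ", strict=" + strictParse(input, pattern));
        }
    }
}
```

In this sketch the first input parses identically under both helpers, which is the convergent case the ConfigMatrix files cover. The other three succeed only under the lenient helper (the out-of-range month rolls over to 2021-01-01), which is why those inputs belong in separate per-policy files with different expected results.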