APP-1316: add BigQuery expiration to elementary test temp tables#1030
Conversation
Add a 12-hour expiration_timestamp to test temp tables created by create_elementary_test_table on BigQuery. This acts as a safety net when on_run_end cleanup fails due to transient errors (e.g. rate limiting), preventing orphan table accumulation. Uses a new dispatched macro set_test_table_expiration that is a no-op for all adapters except BigQuery, where it runs ALTER TABLE SET OPTIONS. Co-Authored-By: mika@elementary-data.com <mika.kerman@gmail.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
|
👋 @MikaKerman |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThe PR threads ChangesExpiration Hours Plumbing
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@macros/edr/tests/test_utils/set_test_table_expiration.sql`:
- Line 12: The `set_test_table_expiration` test utility is invoking
`elementary.run_query` for a BigQuery DDL statement, which should not lowercase
returned column names. Update the `run_query` call in
`set_test_table_expiration.sql` to pass `lowercase_column_names=False` for the
`expiration_query` path so the `ALTER TABLE` wrapper behaves correctly for
DDL/DML results.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 261e3095-b620-4525-9dd3-34f0ca2309b2
📒 Files selected for processing (2)
macros/edr/tests/test_utils/create_elementary_test_table.sqlmacros/edr/tests/test_utils/set_test_table_expiration.sql
Instead of a separate dispatch macro + ALTER TABLE to set expiration on BigQuery test tables, thread an expiration_hours parameter through the existing create_or_replace → edr_create_table_as → edr_get_create_table_as_sql chain. BigQuery's implementation adds OPTIONS(expiration_timestamp=...) at CREATE time when expiration_hours is set, avoiding a second API call. Other adapters silently ignore the parameter. TTL changed from 12 hours to 1 hour to match existing patterns. Refs: APP-1316 Co-Authored-By: mika@elementary-data.com <mika.kerman@gmail.com>
dbt's Jinja does not silently accept extra kwargs - all dispatched implementations must declare the parameter even if they don't use it. Co-Authored-By: mika@elementary-data.com <mika.kerman@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@macros/utils/table_operations/create_or_replace.sql`:
- Around line 1-5: Update the adapter-specific overrides of create_or_replace so
they accept expiration_hours consistently with the base macro. The current
create_or_replace dispatcher forwards expiration_hours, but redshift__,
postgres__, spark__, fabricspark__, athena__, trino__, clickhouse__, and
duckdb__ still only take temporary, relation, and sql_query, which will break
when a non-default value is passed. Add expiration_hours (or a kwargs catch-all)
to each of those macro signatures and keep the forwarded call chain aligned with
create_or_replace.
In `@macros/utils/table_operations/create_table_as.sql`:
- Around line 24-30: The dispatched create-table-as implementations are missing
the new expiration_hours kwarg, so edr_get_create_table_as_sql will break on
adapters that only accept temporary, relation, and sql_query. Update every
dispatched macro implementation referenced by
adapter.dispatch("edr_get_create_table_as_sql", "elementary")—including
default__, snowflake__, postgres__, redshift__, databricks__, clickhouse__,
duckdb__, trino__, spark__, and fabricspark__—to accept expiration_hours (or a
catch-all **kwargs) and thread it through consistently where needed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9ff75cb6-31be-4f2e-b91d-7753f507de3e
📒 Files selected for processing (3)
macros/edr/tests/test_utils/create_elementary_test_table.sqlmacros/utils/table_operations/create_or_replace.sqlmacros/utils/table_operations/create_table_as.sql
Same fix as previous commit - dbt Jinja requires all dispatched implementations to declare keyword arguments that the dispatcher passes. Co-Authored-By: mika@elementary-data.com <mika.kerman@gmail.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
macros/utils/table_operations/create_table_as.sql (1)
143-150: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winForward
expiration_hoursin Fabric Spark delegation.
fabricspark__edr_get_create_table_as_sqlacceptsexpiration_hoursbut drops it when delegating tospark__edr_get_create_table_as_sql, which breaks the new plumbing contract on this adapter path.Suggested fix
{% macro fabricspark__edr_get_create_table_as_sql( temporary, relation, sql_query, expiration_hours=none ) %} {{ return( elementary.spark__edr_get_create_table_as_sql( - temporary, relation, sql_query + temporary, relation, sql_query, expiration_hours=expiration_hours ) ) }} {% endmacro %}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@macros/utils/table_operations/create_table_as.sql` around lines 143 - 150, The fabricspark__edr_get_create_table_as_sql macro is dropping the expiration_hours argument when delegating to elementary.spark__edr_get_create_table_as_sql, which breaks the adapter contract. Update the delegation in fabricspark__edr_get_create_table_as_sql so it forwards expiration_hours along with temporary, relation, and sql_query, keeping the signature and downstream behavior consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@macros/utils/table_operations/create_table_as.sql`:
- Around line 143-150: The fabricspark__edr_get_create_table_as_sql macro is
dropping the expiration_hours argument when delegating to
elementary.spark__edr_get_create_table_as_sql, which breaks the adapter
contract. Update the delegation in fabricspark__edr_get_create_table_as_sql so
it forwards expiration_hours along with temporary, relation, and sql_query,
keeping the signature and downstream behavior consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 35786502-92fc-4256-a8d7-67e89879385d
📒 Files selected for processing (1)
macros/utils/table_operations/create_table_as.sql
Co-Authored-By: mika@elementary-data.com <mika.kerman@gmail.com>
| {# Create the table if it doesnt exist #} | ||
| {%- do elementary.create_or_replace(false, temp_table_relation, sql_query) %} | ||
| {%- do elementary.create_or_replace( | ||
| false, temp_table_relation, sql_query, expiration_hours=1 |
There was a problem hiding this comment.
Add a comment here that this table needs to survive cross-session, so it can't be temp, but we can set expiration hours for adapters that support it (currently only BQ)
Co-Authored-By: mika@elementary-data.com <mika.kerman@gmail.com>
Summary
Adds a 1-hour
expiration_timestampto BigQuery test temp tables at CREATE time as a safety net against orphan accumulation whenon_run_endcleanup fails.Instead of a separate ALTER TABLE macro, threads an
expiration_hoursparameter through the existing infrastructure:On BigQuery, when
expiration_hoursis set andtemporaryis false, the CREATE statement includes the expiration option — single API call, no separate ALTER needed. Other adapters silently ignore the parameter via Jinja kwargs.Why not just change
temporary=falsetotemporary=true? On Snowflake/Postgres/Redshift,temporary=truecreates session-scoped temp tables that aren't visible across dbt threads — test tables created during parallel test execution wouldn't be readable fromon_run_endwhich runs on a different connection.Resolves APP-1316
Link to Devin session: https://app.devin.ai/sessions/2a53f516ec2142da97e0e3e0f75f0489
Requested by: @MikaKerman
Summary by CodeRabbit
expiration_hourssupport for creating tables, including through create-or-replace, and threaded through adapter implementations.expiration_hours=1, ensuring they expire after 1 hour.