Antalya 26.1 - Add ALTER TABLE MODIFY SETTING support for Hybrid watermarks #1659
mkmkme wants to merge 4 commits into antalya-26.1
Introduce hybridParam('name', 'type') pseudo-function for Hybrid engine
predicates, allowing segment boundaries to be changed at runtime via
ALTER TABLE ... MODIFY SETTING without recreating the table.
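To make the idea of a runtime-substituted parameter concrete, here is a minimal C++ sketch. It is not the code from this PR: the real implementation presumably rewrites the query AST, whereas this toy version works on the predicate text. The function name `substituteHybridParams` and the choice to emit `CAST('<value>', '<type>')` are assumptions made purely for illustration, and the regex only handles type names without embedded quotes.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of ClickHouse): replaces every
// hybridParam('name', 'type') occurrence in a predicate string with
// CAST('<value>', 'type'), taking values from a settings map.
std::string substituteHybridParams(const std::string & predicate,
                                   const std::map<std::string, std::string> & settings)
{
    static const std::regex param_re(R"(hybridParam\('([^']+)',\s*'([^']+)'\))");

    std::string result;
    std::size_t last = 0;
    const auto end = std::sregex_iterator();
    for (auto it = std::sregex_iterator(predicate.begin(), predicate.end(), param_re); it != end; ++it)
    {
        const std::smatch & m = *it;
        const auto found = settings.find(m[1].str());
        if (found == settings.end())
            throw std::invalid_argument("no value for watermark " + m[1].str());

        const auto pos = static_cast<std::size_t>(m.position());
        result.append(predicate, last, pos - last);  // text before the match
        result += "CAST('" + found->second + "', '" + m[2].str() + "')";
        last = pos + static_cast<std::size_t>(m.length());
    }
    result.append(predicate, last, std::string::npos);  // text after the last match
    return result;
}
```

Because the predicate text itself never changes, moving a segment boundary only requires changing the value stored in the settings map, which is what makes an ALTER-based update possible without recreating the table.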
Key design decisions:
- hybridParam() takes exactly two arguments (name, type); all values
must be provided via ENGINE SETTINGS, keeping the API surface minimal
and eliminating default-value complexity.
- Watermark names must start with 'hybrid_watermark_' and exactly match
a declared hybridParam() in the predicates. Typos in both CREATE
SETTINGS and ALTER are rejected.
- Values are validated against the declared type at CREATE and ALTER
time, so invalid values never reach the runtime substitution path.
- Only hybrid_watermark_* settings are allowed on Hybrid tables;
regular DistributedSettings and RESET SETTING are rejected.
- Runtime watermark state uses MultiVersion<std::map> for lock-free
reads and deterministic serialization order in SHOW CREATE TABLE.
- In StorageDistributed::alter(), the watermark snapshot is published
before in-memory metadata to prevent concurrent readers from
observing new metadata with stale watermark values.
- Predicate validation at CREATE time substitutes the effective
SETTINGS value (not the type default) so value-sensitive expressions
are checked against realistic data.
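The snapshot and ordering decisions above can be sketched in standalone C++. This is only an illustration under stated assumptions: `WatermarkHolder` and `HybridTableSketch` are hypothetical stand-ins, not ClickHouse's `MultiVersion` or `StorageDistributed`, and the atomic `shared_ptr` free functions used here may or may not be lock-free depending on the standard library.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// std::map keeps keys sorted, so iteration (and serialization) order is stable.
using Watermarks = std::map<std::string, std::string>;

// Hypothetical MultiVersion-style holder: readers take an immutable snapshot,
// writers swap in a whole new map instead of mutating the current one.
class WatermarkHolder
{
public:
    explicit WatermarkHolder(Watermarks initial)
        : current(std::make_shared<const Watermarks>(std::move(initial))) {}

    std::shared_ptr<const Watermarks> get() const { return std::atomic_load(&current); }

    void set(Watermarks next)
    {
        std::atomic_store(&current, std::make_shared<const Watermarks>(std::move(next)));
    }

private:
    std::shared_ptr<const Watermarks> current;
};

// Hypothetical table object; `metadata` stands in for the in-memory metadata.
struct HybridTableSketch
{
    WatermarkHolder watermarks;
    std::shared_ptr<const std::string> metadata;

    HybridTableSketch(Watermarks w, std::string m)
        : watermarks(std::move(w)), metadata(std::make_shared<const std::string>(std::move(m))) {}

    void alter(Watermarks next, std::string new_metadata)
    {
        // Publish watermarks first: a reader that already sees the new
        // metadata can then never pair it with the old watermark values.
        watermarks.set(std::move(next));
        std::atomic_store(&metadata, std::make_shared<const std::string>(std::move(new_metadata)));
    }
};

// Serializes settings in key order, as a SHOW CREATE TABLE-like fragment.
std::string serializeSettings(const Watermarks & w)
{
    std::string out;
    for (const auto & kv : w)
    {
        if (!out.empty())
            out += ", ";
        out += kv.first + " = '" + kv.second + "'";
    }
    return out;
}
```

A reader that called `get()` before an `alter()` keeps its old snapshot for the rest of its query, while new readers see the new values; neither ever observes a half-updated map.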
Tests:
- Stateless tests covering CREATE, ALTER, DETACH/ATTACH persistence,
multi-watermark, type conflict, invalid values, typo rejection,
RESET SETTING rejection, and DistributedSettings rejection.
- Integration tests covering single-node, cross-node, and cluster()
table function flows, each self-contained with own setup/teardown.
Documentation updated with hybridParam() syntax, ALTER examples,
multi-watermark usage, and restriction notes.
Made-with: Cursor
AI audit note: This review comment was generated by AI (gpt-5.4).
Audit update for PR #1659 (ALTER TABLE MODIFY SETTING support for Hybrid watermarks):
Confirmed defects:
Medium: ALTER MODIFY SETTING skips full predicate revalidation
Example:
CREATE TABLE t
ENGINE = Hybrid(
remote('localhost:9000', currentDatabase(), 'hot'),
dateDiff(hybridParam('hybrid_watermark_unit', 'String'), ts, now()) < 30,
remote('localhost:9000', currentDatabase(), 'cold'),
dateDiff(hybridParam('hybrid_watermark_unit', 'String'), ts, now()) >= 30
)
SETTINGS hybrid_watermark_unit = 'day'
AS hot;
This CREATE should pass because
But this ALTER is currently accepted too:
ALTER TABLE t MODIFY SETTING hybrid_watermark_unit = 'banana';
ALTER only validates that
dateDiff('banana', ts, now()) < 30
which is semantically invalid for
Coverage summary:
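The gap described in this audit comment, a value that passes the name and type checks yet breaks the predicate, can be illustrated with a small C++ sketch. All three functions are hypothetical and do not mirror the PR's code; the unit list is a subset of the units `dateDiff` accepts in ClickHouse, and only two types are modeled.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical name check: the setting must carry the hybrid_watermark_ prefix
// and match a hybridParam() declared in the predicates.
bool isValidWatermarkName(const std::string & name, const std::set<std::string> & declared)
{
    const std::string prefix = "hybrid_watermark_";
    return name.compare(0, prefix.size(), prefix) == 0 && declared.count(name) > 0;
}

// Hypothetical type check: only String and UInt64 are modeled in this sketch.
bool isValidForType(const std::string & value, const std::string & type)
{
    if (type == "String")
        return true;  // any text is a valid String
    if (type == "UInt64")
        return !value.empty()
            && std::all_of(value.begin(), value.end(),
                           [](unsigned char c) { return std::isdigit(c) != 0; });
    return false;
}

// Stand-in for full predicate revalidation: dateDiff only accepts known units.
bool isValidDateDiffUnit(const std::string & unit)
{
    static const std::set<std::string> units = {
        "second", "minute", "hour", "day", "week", "month", "quarter", "year"};
    return units.count(unit) > 0;
}
```

With these checks, `'banana'` for `hybrid_watermark_unit` passes `isValidWatermarkName` and `isValidForType(..., "String")` but fails `isValidDateDiffUnit`, which is the kind of error that only re-running predicate validation with the new value would catch at ALTER time.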

The above comment is a very good point by FWIW the previous

It kind of works with an ALIAS column, but it's not smart enough to correctly recognize (_table = xxxx AND ....) OR (_table = yyy AND ....): https://fiddle.clickhouse.com/c667b318-1b36-45e7-ad14-b6db1fb55ff7

Usability Audit: ALTER TABLE MODIFY SETTING for Hybrid watermarks (PR #1659)
Scope:
Summary of dedup vs PR comments
Before filing, every issue below was checked against the full PR thread (
Other PR-thread items noted but not listed above:
Primary flows (ranked by expected usage)
Issues
Flow 2: Runtime watermark modification
2.1 ALTER accepts a watermark value that makes the predicate semantically invalid; next read fails
Status: already raised (AI
2.2 Failure from a bad ALTER surfaces as a cryptic function-level error on read
Status: new in this audit. Not in PR comments.
2.3 Once set, a watermark cannot be removed except by recreating the table
Status: new in this audit. Not in PR comments.
Flow 1: Default Hybrid with watermark (create + SELECT)
1.1 Read path indexes

I have an update on The premise
The first half is correct: CREATE does run Why
@alsugiliazova I'll try to briefly address each of the bullet points in your report. As sources, I used discussions with Misha and summaries of planning in Cursor.
See the explanation above. At this point it will probably stay as a known limitation.
This should probably also stay as a known limitation for now, due to limitations in the CH implementation.
This should be fine, as this ultimately changes the table itself. In that case the table should be re-created from scratch.
Fixed
Fixed
This is quite a significant change outside the scope of this PR; I would leave it as-is for now.
It's quite simple to implement, but the change would run to several hundred lines of new code. I would personally defer it to a follow-up PR for the next refresh. If strongly needed, I can implement it here, but I doubt that is necessary given the Low priority of the issue in the report.
This is also fine, since it requires changing a predicate and hence the whole table.
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added support for moving Hybrid table watermarks