[WIP] Fix Iceberg REST catalog commit safety — corruption under post-commit network failure#1676
Open
il9ue wants to merge 1 commit intoantalya-26.1from
Open
[WIP] Fix Iceberg REST catalog commit safety — corruption under post-commit network failure#1676il9ue wants to merge 1 commit intoantalya-26.1from
il9ue wants to merge 1 commit intoantalya-26.1from
Conversation
Original ticket #1609 (storage leak cleanup on TRUNCATE commit failure) investigation revealed a latent table-corruption bug in the REST catalog client that the #1609 fix would propagate into TRUNCATE if written as originally scoped. Repurposing this branch to fix the root cause, with #1609 becoming the final layer on top. See design-notes.md for full investigation and layered fix proposal. Awaiting lead approval on scope expansion before implementation.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Status: WIP / Draft — awaiting lead approval on scope. Do not merge.
Original ticket #1609 (storage leak cleanup on TRUNCATE commit failure)
investigation revealed a latent table-corruption bug in the REST catalog
client that the #1609 fix would propagate into TRUNCATE if written as
originally scoped.
Under a post-commit network failure, the REST catalog HTTP client retries
the commit POST, receives 409 Conflict (because the server already
advanced the snapshot), and
updateMetadatacollapses the 409 toreturn false. Callers interpret that as a clean pre-commit rejectionand run their cleanup path — deleting files the catalog now references.
Result: table corruption.
This affects
Mutations.cppin merged code today and would affectTRUNCATE if #1609 were implemented as originally described. Repurposing
this branch to fix the root cause across three layers, with #1609
becoming the final layer on top.
Scope
src/IO/ReadWriteBufferFromHTTP.cpp) — classify 409as non-retriable.
src/Databases/DataLake/RestCatalog.cpp) — replacebool updateMetadatawith a typed result distinguishingCommitted/RejectedCleanly/Unknown.Mutations.cpp,IcebergMetadata::truncate) — gatecleanup on
RejectedCleanlyonly; onUnknown, log and preservestate.
Evidence
Every claim cited to
file:lineindesign-notes.mdon this branch.Full investigation transcripts attached to escalation.
src/IO/ReadWriteBufferFromHTTP.cpp:22-34— 409 is retriablesrc/IO/ReadWriteBufferFromHTTP.cpp:519-522—impl.reset()forcesbody re-send on every retry
src/Databases/DataLake/RestCatalog.cpp:1347-1355— single catch,no status inspection
src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cpp:522,530— unconditional cleanup call sites
Violates Iceberg REST spec §3.5 (409 should be non-retriable; requires
metadata refresh, not retry).
Related
point; becomes Layer 3)
(pending lead approval)
See
design-notes.mdfor full investigation and layered fix proposal.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fixed a data loss bug in the Iceberg REST catalog client where a network
failure after a successful
updateMetadatacommit could cause the clientto delete object-storage files that the catalog now references, leaving
the table in a corrupted state. The HTTP client now classifies 409
Conflict as non-retriable,
updateMetadatareturns a typed result thatdistinguishes clean rejection from ambiguous outcomes, and callers
(Iceberg mutations and TRUNCATE) only clean up written files when the
commit is known to have been cleanly rejected. Also adds missing cleanup
of orphaned files in the TRUNCATE path (#1609).
Documentation entry for user-facing changes
No documentation changes required. This fix does not change the public
SQL surface or configuration. Internal HTTP retry behavior for 409
responses changes: REST catalog calls receiving 409 will no longer retry
and will surface the conflict to the caller on the first occurrence,
which is the correct Iceberg REST spec behavior.
CI/CD Options
Exclude tests:
Regression jobs to run: