Azure ADLS Gen2 backend — 3 bugs found: app role LocalFileSystem, concurrent writer 409, compactor manifest format-version mismatch #17

@imtiazqa

Built coldfront-duckdb15:pg18 locally on Apple Silicon Mac (arm64), transferred the image to a Rocky Linux 10 arm64 VM via docker save | gzip | scp | docker load, then ran make compactor && ci/topo/vanilla.sh --pg 18 --mode tiered --backend azure --compose docker-compose.matrix-azure.yml.

Before running, set these environment variables in the session:

export COLDFRONT_AZURE_ACCOUNT="..."
export COLDFRONT_AZURE_FILESYSTEM="..."
export COLDFRONT_AZURE_KEY="..."
export COLDFRONT_AZURE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=${COLDFRONT_AZURE_ACCOUNT};AccountKey=${COLDFRONT_AZURE_KEY};EndpointSuffix=core.windows.net"
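
Since a missing variable would only show up later as an Azure-side failure, a small preflight check can help when reproducing this. The following is a minimal sketch, not something from the repo: `check_azure_env` is a hypothetical helper name, and it only verifies that the four variables above are non-empty, not that the credentials are valid.

```shell
# Hypothetical preflight helper; not part of the coldfront repository.
# Returns non-zero and names every COLDFRONT_AZURE_* variable that is
# unset or empty. It does not validate the credentials themselves.
check_azure_env() {
    missing=0
    for name in COLDFRONT_AZURE_ACCOUNT COLDFRONT_AZURE_FILESYSTEM \
                COLDFRONT_AZURE_KEY COLDFRONT_AZURE_CONNECTION_STRING; do
        # Indirect lookup of the variable whose name is held in $name
        # (POSIX-compatible; the names come from the fixed list above).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_azure_env || exit 1` before the `ci/topo/vanilla.sh` invocation would stop the run early instead of letting it fail partway through the journey.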

Result: 123 passed / 10 failed

Bug 1 — Non-superuser app role cold I/O fails on Azure (LocalFileSystem disabled)

File: extension/coldfront/coldfront--0.1.sql — ensure_attached() line 180

ensure_attached() is SECURITY DEFINER and pre-loads the iceberg extension inside the elevated context but never calls LOAD azure. On Azure, DuckDB auto-loads the azure extension on the first query — at that point the session is already running as the non-superuser japp, so pg_duckdb has LocalFileSystem disabled → extension file load fails. Works fine on S3/SeaweedFS because httpfs loads as a side-effect of the iceberg ATTACH with no further LocalFileSystem call.

ERROR: Permission Error: File system LocalFileSystem has been disabled by configuration
Failing: app role reads tiered hot+cold (expected '290', got 'SET'); app role cold write landed (expected '1', got '0')

Fix: detect storage type from coldfront.storage_secret inside ensure_attached() and call PERFORM duckdb.raw_query('LOAD azure') before returning, within the SECURITY DEFINER context.

Bug 2 — Concurrent mixed-tier writers get Conflict_409 on Azure (advisory lock + deferred POST race)

File: extension/coldfront/coldfront--0.1.sql — _exec_iceberg_with_claim() line 1964

The vanilla single-node path takes pg_advisory_xact_lock and then calls duckdb.raw_query(). pg_duckdb defers the Iceberg commit POST until after the PostgreSQL transaction ends, which is also when the advisory lock is released, so the lock does not cover the commit itself. On SeaweedFS (local, sub-millisecond) the deferred POST fires before the next writer gets in. On Azure ADLS (network round-trip) the POST is slow enough that multiple writers release their locks and queue deferred POSTs at Lakekeeper simultaneously; they all race on the same snapshot, resulting in Conflict_409. The same race also surfaces as the "archiver errored during race window" failure (phase 3 attempt 2, 409).

Failed to commit Iceberg transaction: Conflict_409, CatalogCommitConflicts
expected: 9099356468329532207, found: 562916686884392705
Failing: no concurrent mixed-tier writer errored (expected '0', got '12'); 4 concurrent writers all landed (expected '8', got '2'); archiver errored during race window

Bug 3 — Compactor fails on all backends (manifest format-version patch incomplete)

File: docker/iceberg-manifest-list-format-version-v15.patch

iceberg-manifest-list-format-version-v15.patch adds format-version to the manifest-list Avro file header metadata. However, iceberg-go (cmd/compactor) also validates the per-entry format-version field inside the manifest list records. The patch fixes the header but leaves the per-entry field at v1 — iceberg-go checks both and rejects on the mismatch. This is not Azure-specific; it fails on all backends. It was hidden because ci/topo/vanilla.sh only runs make build (archiver + partitioner) and never builds the compactor — stories 6d and 6e silently failed with No such file or directory in all prior runs. Running make compactor before the journey exposes the underlying patch bug.

compactor: plan files for default.events:
manifest file's 'format-version' metadata indicates version 2,
but entry from manifest list indicates version 1
Failing: compactor --dry-run found nothing to compact (6d), expire --dry-run shows nothing to expire (6e)

Also: ci/topo/vanilla.sh should run make compactor before the journey step so stories 6d/6e are never silently skipped again.
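
One way to make the missing binary fail loudly, sketched under assumptions: `require_executable` is a hypothetical helper, and the path passed to it would have to be wherever `make compactor` actually places the binary, which is not stated here.

```shell
# Hypothetical guard; not part of the coldfront repository.
# Fails with a clear message when the given path is not an executable file,
# so a missing binary cannot be mistaken for a story failure.
require_executable() {
    if [ ! -x "$1" ]; then
        echo "required binary not found or not executable: $1" >&2
        return 1
    fi
}
```

In ci/topo/vanilla.sh this could run right after the build step, with the compactor's real output path as the argument and `|| exit 1` appended, so stories 6d/6e abort the run rather than report "No such file or directory".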
