Skip to content

Silent data loss when UPDATEing cold row's partition key (ts) to a hot timestamp range #20

@imtiazqa

Description

@imtiazqa

When a user UPDATEs a cold (Iceberg) row's ts column to a timestamp that falls within the hot (PostgreSQL) range, the row silently disappears — it is neither in the hot table nor visible via the unified VIEW after the UPDATE succeeds.

Steps to reproduce:

Setup

CREATE TABLE coldfront.events (
    id BIGSERIAL,
    ts TIMESTAMPTZ NOT NULL,
    status TEXT,
    payload TEXT
) PARTITION BY RANGE (ts);

ALTER TABLE coldfront.events ADD PRIMARY KEY (id, ts);

-- Cold partition (Jan — older than 3 months, will be archived)
CREATE TABLE coldfront.events_p_2026_01
    PARTITION OF coldfront.events
    FOR VALUES FROM ('2026-01-01 00:00:00+00') TO ('2026-02-01 00:00:00+00');

-- Hot partition (Jun — within hot_period, stays in PG)
CREATE TABLE coldfront.events_p_2026_06
    PARTITION OF coldfront.events
    FOR VALUES FROM ('2026-06-01 00:00:00+00') TO ('2026-07-01 00:00:00+00');

INSERT INTO coldfront.events (ts, status, payload) VALUES
  ('2026-01-10 10:00:00+00', 'open', 'cold-row-1'),
  ('2026-01-15 14:00:00+00', 'open', 'cold-row-2'),
  ('2026-01-20 09:00:00+00', 'open', 'cold-row-3');

INSERT INTO coldfront.events (ts, status, payload) VALUES
  ('2026-06-10 10:00:00+00', 'open', 'hot-row-1'),
  ('2026-06-12 14:00:00+00', 'open', 'hot-row-2'),
  ('2026-06-15 09:00:00+00', 'open', 'hot-row-3');
\q

Register + Archive

./archiver register
--config /tmp/cf-gcs-db.yaml
--schema coldfront
--table events
--period monthly
--hot-period "3 months"

./archiver --config /tmp/cf-gcs-db.yaml run

Verify both tiers before UPDATE

psql -h localhost -U coldfront -d coldfront

-- Confirm events=VIEW, _events=hot table
SELECT relname, relkind FROM pg_class
WHERE relnamespace = 'coldfront'::regnamespace
AND relname IN ('events', '_events');

-- Confirm 3 hot rows only in _events
SELECT COUNT(*) FROM coldfront._events;

-- Confirm 6 rows total via unified VIEW
SELECT ts, status, payload FROM coldfront.events ORDER BY ts;

Reproduce the bug

-- Step 1: confirm cold-row-2 exists in cold tier
SELECT ts, status, payload FROM coldfront.events WHERE payload = 'cold-row-2';

-- Step 2: UPDATE cold row's ts to a hot-range timestamp
UPDATE coldfront.events SET ts = '2026-06-18 10:00:00+00' WHERE payload = 'cold-row-2';

-- Step 3: verify row is gone
SELECT ts, status, payload FROM coldfront.events WHERE payload = 'cold-row-2';
SELECT COUNT(*) FROM coldfront._events WHERE payload = 'cold-row-2';

-- Should show the row if it moved to hot
SELECT * FROM coldfront._events WHERE payload = 'cold-row-2';

-- Should show the row if it stayed in cold
SELECT * FROM coldfront.events WHERE payload = 'cold-row-2';

-- Total row count — should be 6, will be 5 if row is lost
SELECT COUNT(*) FROM coldfront.events;
 id | ts | status | payload 
----+----+--------+---------
(0 rows)

 id | ts | status | payload 
----+----+--------+---------
(0 rows)

 count 
-------
     5
(1 row)

coldfront=# 

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions