Search before asking
Paimon version
1.4
Compute Engine
Flink (Paimon Sink). Affects any engine using Paimon's IcebergCommitCallback (StarRocks in our case)
Minimal reproduce step
- Create an append-only Paimon table with the Iceberg metadata committer enabled (e.g. `'metadata.iceberg.storage' = 'rest-catalog'` pointed at a REST catalog such as Polaris, or `hadoop-catalog`).
- Stream data in so that Paimon's LSM engine performs its normal level compaction (any long-running streaming ingest will do this within minutes).
- After a compaction happens, read the Iceberg metadata for the resulting snapshot:

  ```shell
  gcloud storage cat gs://<warehouse>/<db>/<table>/metadata/v<N>.metadata.json \
    | jq '.snapshots[-5:] | .[] | {id:."snapshot-id", op:.summary.operation, added:.summary["added-records"], deleted:.summary["deleted-records"]}'
  ```
- Observe that the compaction snapshot is labeled `"operation": "overwrite"` even though no logical rows were added or deleted (`added-records == 0`, `deleted-records == 0`; only files were reorganized).

What doesn't meet your expectations?
What doesn't meet your expectations?
Per the Iceberg spec, the four snapshot operation values have distinct semantics:
| operation | Meaning |
| --- | --- |
| `append` | Only new data files added. |
| `replace` | Files added and removed without changing table data (compaction, format change, relocation). |
| `overwrite` | Files added and removed, and table data may have changed (INSERT OVERWRITE, MERGE, row-level deletes). |
| `delete` | Only files removed. |
Paimon's own LSM compaction is by definition a pure file rewrite with no logical row change — this is exactly what Iceberg's `replace` operation is for. Native Iceberg writers (`RewriteFiles`, `RewriteManifests`) use `DataOperations.REPLACE` for this case, and all of Iceberg's incremental scan APIs (`IncrementalAppendScan`, `IncrementalChangelogScan`, Spark `MicroBatchStream`, Flink `MonitorSource`) treat `replace` as a no-op for incremental reads.

Paimon currently emits `overwrite` for these compaction snapshots, which is indistinguishable — from a downstream reader's point of view — from a genuine row-changing overwrite. This breaks any downstream consumer that relies on the spec's distinction.
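To make the downstream contract concrete, here is a standalone sketch (plain Java, not Iceberg or Paimon source) of the per-snapshot decision an incremental reader makes; the string constants mirror the Iceberg spec's operation values, and the helper name is illustrative:

```java
// Standalone illustration of reader-side snapshot classification per the
// Iceberg spec. The operation strings match the spec's four values.
public class SnapshotOps {
    /** Returns true if an incremental reader must process this snapshot. */
    static boolean mayChangeRows(String operation) {
        switch (operation) {
            case "append":    // only new data files added
            case "overwrite": // rows may have been added and removed
            case "delete":    // rows removed
                return true;
            case "replace":   // pure file rewrite (compaction): safe to skip
                return false;
            default:
                throw new IllegalArgumentException("unknown operation: " + operation);
        }
    }

    public static void main(String[] args) {
        System.out.println(mayChangeRows("replace"));
        System.out.println(mayChangeRows("overwrite"));
    }
}
```

A Paimon compaction snapshot labeled `overwrite` forces this check to return true, which is exactly the misclassification this issue is about.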
Anything else?
Root cause
`IcebergSnapshotSummary` only defines two constants, and there is no code path in Paimon that produces `"replace"`:

```java
// paimon-core/src/main/java/org/apache/paimon/iceberg/metadata/IcebergSnapshotSummary.java
public static final IcebergSnapshotSummary APPEND = new IcebergSnapshotSummary("append");
public static final IcebergSnapshotSummary OVERWRITE = new IcebergSnapshotSummary("overwrite");
```
`IcebergCommitCallback` runs after every Paimon commit (both `CommitKind.APPEND` and `CommitKind.COMPACT`). It does not inspect the Paimon `CommitKind`; it just diffs files and falls back to `OVERWRITE` any time a previously-manifested file was removed:

```java
// paimon-core/src/main/java/org/apache/paimon/iceberg/IcebergCommitCallback.java
// (createWithDeleteManifestFileMetas)
} else {
    // some file is removed, rewrite this file meta
    snapshotSummary = IcebergSnapshotSummary.OVERWRITE;
    ...
}
```
Compaction — which always removes the old L0/L1/... files and adds the merged result — therefore deterministically lands as `overwrite` rather than `replace`.
Downstream impact (StarRocks example)
StarRocks IVM (Incremental Materialized View) refresh on a Paimon-produced Iceberg table fails on every compaction snapshot with:
```
com.starrocks.sql.analyzer.SemanticException: Getting analyzing error.
Detail message: TvrTableDeltaTrait is not append-only for base table: <db>.<table>,
delta:DeltaTrait{delta=Delta@[<snap>,<snap>], changeType=RETRACTABLE,
stats=Stats{addedRows=0, addedFileSize=0}}.
```
StarRocks recently fixed this for native-Iceberg tables in StarRocks#69825, which skips `replace` snapshots in `IcebergMetadata.listTableDeltaTraits()`. That fix does not apply to Paimon-written Iceberg tables because Paimon never emits `replace`. The StarRocks PR author explicitly scoped the fix to Iceberg and noted that Paimon would need a separate change, so the cleanest place for it is upstream in Paimon, where the Iceberg semantics can be made to match the spec.
Related context:
Proposed fix
- Add a `REPLACE` constant to `IcebergSnapshotSummary`:

  ```java
  public static final IcebergSnapshotSummary REPLACE = new IcebergSnapshotSummary("replace");
  ```
- In `IcebergCommitCallback`, thread the Paimon `CommitKind` (or the logical "rows unchanged" signal) through to the summary decision. When the underlying Paimon commit is `CommitKind.COMPACT` — or, equivalently, when the file-level diff adds/removes files but contributes zero net rows — emit `REPLACE` instead of `OVERWRITE`.
- Keep `OVERWRITE` for genuine row-changing operations (INSERT OVERWRITE, merge-on-read deletes that actually drop logical rows, etc.).
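The proposed branch can be sketched in isolation as follows. This is a standalone illustration, not actual Paimon code: the `CommitKind` enum mirrors Paimon's, and the `icebergSummary` helper name is hypothetical — the real change would live inside `IcebergCommitCallback`'s file-diff logic:

```java
// Hypothetical sketch of the proposed summary selection. CommitKind mirrors
// Paimon's enum; the helper and its signature are illustrative only.
public class SummaryDecision {
    enum CommitKind { APPEND, COMPACT, OVERWRITE }

    static String icebergSummary(CommitKind kind, boolean previousFilesRemoved) {
        if (!previousFilesRemoved) {
            return "append";    // only new data files were added
        }
        if (kind == CommitKind.COMPACT) {
            return "replace";   // pure file rewrite: no logical row change
        }
        return "overwrite";     // rows may genuinely have changed
    }

    public static void main(String[] args) {
        System.out.println(icebergSummary(CommitKind.COMPACT, true));
        System.out.println(icebergSummary(CommitKind.APPEND, false));
        System.out.println(icebergSummary(CommitKind.OVERWRITE, true));
    }
}
```

The key point is that the branch keys off the commit's intent (or a zero-net-rows check) rather than off the file diff alone, which is what currently collapses compaction into `overwrite`.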
This aligns Paimon's Iceberg-compat metadata with the Iceberg spec and lets downstream incremental readers (StarRocks IVM, Spark structured streaming incremental scans, Flink Iceberg source, etc.) correctly treat Paimon compaction as a no-op for incremental refresh.
Happy to send a PR if a maintainer can confirm the proposed shape (new enum constant + `CommitKind`-based branch) is acceptable.
Are you willing to submit a PR?