[fix](fe) Disable row-store lazy fetch for shared base columns#62864
[fix](fe) Disable row-store lazy fetch for shared base columns#62864englefly wants to merge 1 commit intoapache:masterfrom
Conversation
### What problem does this PR solve? Issue Number: None Related PR: None Problem Summary: Prevent topn lazy materialization from using row-store fetch when multiple lazy slots share the same base column, such as variant root/subcolumn accesses. This avoids wrong results in the non-two-phase row-store path. ### Release note None ### Check List (For Author) - Test: No need to test (waiting for manual confirmation) - Behavior changed: Yes (lazy materialization now falls back from row-store fetch to column fetch when lazy slots share a base column) - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
There was a problem hiding this comment.
I found 2 blocking issues.
PhysicalPlanTranslator.canUseRowStoreForLazySlots()still allows row-store fetch for a single lazy variant subcolumn. BE row-store decoding keys only bycol_unique_idand ignorescolumn_paths(), so a plan that lazily materializes onlykv[ssl]still returns the variant root instead of the subcolumn.- The legacy defer-materialize / two-phase-read path is still unfixed. Changing
enableTwoPhaseReadOpttofalseonly hides the problem for the default session; if a session explicitly enables it and reachesDeferMaterializeTopNResult,OlapTable.generateTwoPhaseReadOption()still turns on row-store fetch without any slot-level safeguard.
Critical checkpoints:
- Goal / correctness: Partially achieved. The duplicate-base-column case in
PhysicalLazyMaterializeis covered, but the row-store correctness hole is not closed end to end. - Scope / minimality: The translator change is small, but the session-variable default flip broadens behavior without fixing the legacy path.
- Concurrency: No new concurrency or locking issues found.
- Lifecycle / static init: No special lifecycle issues found.
- Config behavior: Existing config
enable_two_phase_read_optchanges default behavior, but the underlying legacy path remains reachable when explicitly enabled. - Parallel paths: Not complete. The same safeguard is still missing from the legacy
DeferMaterializeTopNResult/generateTwoPhaseReadOption()path. - Tests: Added helper-level unit coverage only. Missing coverage for an actual lazy variant subcolumn plan and for the legacy path.
- Test results / persistence / writes / observability / FE-BE variable passing: No additional issues found here.
User focus: no additional review focus was provided.
| List<Boolean> rowStoreFlags = new ArrayList<>(); | ||
| for (Relation relation : materialize.getRelations()) { | ||
| rowStoreFlags.add(shouldUseRowStore(relation)); | ||
| rowStoreFlags.add(shouldUseRowStore(relation, materialize.getLazySlots(relation))); |
There was a problem hiding this comment.
This is still not sufficient for variant subcolumns. LogicalOlapScan builds kv[ssl] as a SlotReference with the root column unique id plus a non-empty subpath, so canUseRowStoreForLazySlots() still returns true when there is only one such lazy slot. But the BE row-store path (RowIdStorageReader::read_batch_doris_format_row() -> JsonbSerializeUtil::jsonb_to_block()) decodes only by col_unique_id and never consults column_paths(), so it materializes the whole variant root instead of the requested subcolumn. We need to disable row-store whenever any lazy slot has a subpath, not only when multiple slots share the same base column.
| // 2. spawn fetch RPC to other nodes to get related data by sorted rowids | ||
| @VarAttrDef.VarAttr(name = ENABLE_TWO_PHASE_READ_OPT, fuzzy = true) | ||
| public boolean enableTwoPhaseReadOpt = true; | ||
| public boolean enableTwoPhaseReadOpt = false; |
There was a problem hiding this comment.
Changing the default here only hides the bug for the default session; it does not fix the remaining legacy path. If a session explicitly enables enable_two_phase_read_opt and reaches DeferMaterializeTopNResult, visitPhysicalDeferMaterializeResultSink() still uses OlapTable.generateTwoPhaseReadOption(), which unconditionally enables row-store fetch for row-store tables. The same wrong-result case is still reachable there, so this needs the same slot-level guard in the legacy path instead of only a default flip.
FE UT Coverage ReportIncrement line coverage |
| return false; | ||
| } | ||
| Optional<Column> originalColumn = ((SlotReference) lazySlot).getOriginalColumn(); | ||
| if (!originalColumn.isPresent() || !originalColumnUniqueIds.add(originalColumn.get().getUniqueId())) { |
There was a problem hiding this comment.
what's !originalColumnUniqueIds.add(originalColumn.get().getUniqueId()) mean? maybe need add comment to explain it
| // 2. spawn fetch RPC to other nodes to get related data by sorted rowids | ||
| @VarAttrDef.VarAttr(name = ENABLE_TWO_PHASE_READ_OPT, fuzzy = true) | ||
| public boolean enableTwoPhaseReadOpt = true; | ||
| public boolean enableTwoPhaseReadOpt = false; |
There was a problem hiding this comment.
wahy change the default value?
FE Regression Coverage ReportIncrement line coverage |
What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Prevent topn lazy materialization from using row-store fetch when multiple lazy slots share the same base column, such as variant root/subcolumn accesses. This avoids wrong results in the non-two-phase row-store path.
Release note
None
Check List (For Author)
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)