Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -2804,7 +2804,7 @@ public PlanFragment visitPhysicalLazyMaterialize(PhysicalLazyMaterialize<? exten

List<Boolean> rowStoreFlags = new ArrayList<>();
for (Relation relation : materialize.getRelations()) {
rowStoreFlags.add(shouldUseRowStore(relation));
rowStoreFlags.add(shouldUseRowStore(relation, materialize.getLazySlots(relation)));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not sufficient for variant subcolumns. LogicalOlapScan builds kv[ssl] as a SlotReference with the root column unique id plus a non-empty subpath, so canUseRowStoreForLazySlots() still returns true when there is only one such lazy slot. But the BE row-store path (RowIdStorageReader::read_batch_doris_format_row() -> JsonbSerializeUtil::jsonb_to_block()) decodes only by col_unique_id and never consults column_paths(), so it materializes the whole variant root instead of the requested subcolumn. We need to disable row-store whenever any lazy slot has a subpath, not only when multiple slots share the same base column.

}
materializeNode.setRowStoreFlags(rowStoreFlags);

Expand All @@ -2817,14 +2817,28 @@ public PlanFragment visitPhysicalLazyMaterialize(PhysicalLazyMaterialize<? exten
return inputPlanFragment;
}

private boolean shouldUseRowStore(Relation rel) {
static boolean canUseRowStoreForLazySlots(List<Slot> lazySlots) {
Set<Integer> originalColumnUniqueIds = new HashSet<>();
for (Slot lazySlot : lazySlots) {
if (!(lazySlot instanceof SlotReference)) {
return false;
}
Optional<Column> originalColumn = ((SlotReference) lazySlot).getOriginalColumn();
if (!originalColumn.isPresent() || !originalColumnUniqueIds.add(originalColumn.get().getUniqueId())) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's !originalColumnUniqueIds.add(originalColumn.get().getUniqueId()) mean? maybe need add comment to explain it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add 函数返回true 就表示这个元素已经在集合里了

return false;
}
}
return true;
}

private boolean shouldUseRowStore(Relation rel, List<Slot> lazySlots) {
boolean useRowStore = false;
if (rel instanceof PhysicalOlapScan) {
OlapTable olapTable = ((PhysicalOlapScan) rel).getTable();
useRowStore = olapTable.storeRowColumn()
&& CollectionUtils.isEmpty(olapTable.getTableProperty().getCopiedRowStoreColumns());
}
return useRowStore;
return useRowStore && canUseRowStoreForLazySlots(lazySlots);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -294,4 +294,8 @@ public List<List<Integer>> getLazyBaseColumnIndices() {
public List<Slot> getRowIds() {
return rowIdList;
}

public List<Slot> getLazySlots(Relation relation) {
return relationToLazySlotMap.getOrDefault(relation, ImmutableList.of());
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -2194,7 +2194,7 @@ public boolean isEnableHboNonStrictMatchingMode() {
// 1. read related rowids along with necessary column data
// 2. spawn fetch RPC to other nodes to get related data by sorted rowids
@VarAttrDef.VarAttr(name = ENABLE_TWO_PHASE_READ_OPT, fuzzy = true)
public boolean enableTwoPhaseReadOpt = true;
public boolean enableTwoPhaseReadOpt = false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the default here only hides the bug for the default session; it does not fix the remaining legacy path. If a session explicitly enables enable_two_phase_read_opt and reaches DeferMaterializeTopNResult, visitPhysicalDeferMaterializeResultSink() still uses OlapTable.generateTwoPhaseReadOption(), which unconditionally enables row-store fetch for row-store tables. The same wrong-result case is still reachable there, so this needs the same slot-level guard in the legacy path instead of only a default flip.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wahy change the default value?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deferMaterialize已经被topn lazy materialize 替代。这个bug 在 两种materialize里都存在。我只修了topn lazy,后面会将deferMaterilize删掉,所以就没有修defer的部分。用户关掉topn lazy后 如果enableTwoPhaseReadOpt是true,那么就可能走入 deferMaterialize的bug

@VarAttrDef.VarAttr(name = TOPN_OPT_LIMIT_THRESHOLD)
public long topnOptLimitThreshold = 1024;
@VarAttrDef.VarAttr(name = TOPN_FILTER_RATIO)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -166,4 +166,30 @@ public void testRepeatInputOutputOrder() throws Exception {
}
);
}

@Test
public void testCanUseRowStoreForLazySlots() {
Column distinctA = new Column("a", org.apache.doris.catalog.Type.INT);
distinctA.setUniqueId(1);
Column distinctB = new Column("b", org.apache.doris.catalog.Type.INT);
distinctB.setUniqueId(2);
Column sharedVariant = new Column("kv", org.apache.doris.catalog.Type.VARIANT);
sharedVariant.setUniqueId(3);

SlotReference distinctSlotA = new SlotReference(StatementScopeIdGenerator.newExprId(), "a",
IntegerType.INSTANCE, true, ImmutableList.of(), null, distinctA, null, distinctA);
SlotReference distinctSlotB = new SlotReference(StatementScopeIdGenerator.newExprId(), "b",
IntegerType.INSTANCE, true, ImmutableList.of(), null, distinctB, null, distinctB);
SlotReference variantRootSlot = new SlotReference(StatementScopeIdGenerator.newExprId(), "kv",
org.apache.doris.nereids.types.VariantType.INSTANCE, true, ImmutableList.of(),
null, sharedVariant, null, sharedVariant);
SlotReference variantSubColumnSlot = new SlotReference(StatementScopeIdGenerator.newExprId(), "kv",
org.apache.doris.nereids.types.VariantType.INSTANCE, true, ImmutableList.of(),
null, sharedVariant, null, sharedVariant, ImmutableList.of("ssl"));

Assertions.assertTrue(PhysicalPlanTranslator.canUseRowStoreForLazySlots(
ImmutableList.of(distinctSlotA, distinctSlotB)));
Assertions.assertFalse(PhysicalPlanTranslator.canUseRowStoreForLazySlots(
ImmutableList.of(variantRootSlot, variantSubColumnSlot)));
}
}
Loading