-
Notifications
You must be signed in to change notification settings - Fork 3.8k
[fix](fe) Disable row-store lazy fetch for shared base columns #62864
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2804,7 +2804,7 @@ public PlanFragment visitPhysicalLazyMaterialize(PhysicalLazyMaterialize<? exten | |
|
|
||
| List<Boolean> rowStoreFlags = new ArrayList<>(); | ||
| for (Relation relation : materialize.getRelations()) { | ||
| rowStoreFlags.add(shouldUseRowStore(relation)); | ||
| rowStoreFlags.add(shouldUseRowStore(relation, materialize.getLazySlots(relation))); | ||
| } | ||
| materializeNode.setRowStoreFlags(rowStoreFlags); | ||
|
|
||
|
|
@@ -2817,14 +2817,28 @@ public PlanFragment visitPhysicalLazyMaterialize(PhysicalLazyMaterialize<? exten | |
| return inputPlanFragment; | ||
| } | ||
|
|
||
| private boolean shouldUseRowStore(Relation rel) { | ||
| static boolean canUseRowStoreForLazySlots(List<Slot> lazySlots) { | ||
| Set<Integer> originalColumnUniqueIds = new HashSet<>(); | ||
| for (Slot lazySlot : lazySlots) { | ||
| if (!(lazySlot instanceof SlotReference)) { | ||
| return false; | ||
| } | ||
| Optional<Column> originalColumn = ((SlotReference) lazySlot).getOriginalColumn(); | ||
| if (!originalColumn.isPresent() || !originalColumnUniqueIds.add(originalColumn.get().getUniqueId())) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what's
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. add 函数返回true 就表示这个元素已经在集合里了 |
||
| return false; | ||
| } | ||
| } | ||
| return true; | ||
| } | ||
|
|
||
| private boolean shouldUseRowStore(Relation rel, List<Slot> lazySlots) { | ||
| boolean useRowStore = false; | ||
| if (rel instanceof PhysicalOlapScan) { | ||
| OlapTable olapTable = ((PhysicalOlapScan) rel).getTable(); | ||
| useRowStore = olapTable.storeRowColumn() | ||
| && CollectionUtils.isEmpty(olapTable.getTableProperty().getCopiedRowStoreColumns()); | ||
| } | ||
| return useRowStore; | ||
| return useRowStore && canUseRowStoreForLazySlots(lazySlots); | ||
| } | ||
|
|
||
| @Override | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2194,7 +2194,7 @@ public boolean isEnableHboNonStrictMatchingMode() { | |
| // 1. read related rowids along with necessary column data | ||
| // 2. spawn fetch RPC to other nodes to get related data by sorted rowids | ||
| @VarAttrDef.VarAttr(name = ENABLE_TWO_PHASE_READ_OPT, fuzzy = true) | ||
| public boolean enableTwoPhaseReadOpt = true; | ||
| public boolean enableTwoPhaseReadOpt = false; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changing the default here only hides the bug for the default session; it does not fix the remaining legacy path. If a session explicitly enables
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. wahy change the default value?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. deferMaterialize已经被topn lazy materialize 替代。这个bug 在 两种materialize里都存在。我只修了topn lazy,后面会将deferMaterilize删掉,所以就没有修defer的部分。用户关掉topn lazy后 如果enableTwoPhaseReadOpt是true,那么就可能走入 deferMaterialize的bug |
||
| @VarAttrDef.VarAttr(name = TOPN_OPT_LIMIT_THRESHOLD) | ||
| public long topnOptLimitThreshold = 1024; | ||
| @VarAttrDef.VarAttr(name = TOPN_FILTER_RATIO) | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still not sufficient for variant subcolumns.
LogicalOlapScanbuildskv[ssl]as aSlotReferencewith the root column unique id plus a non-empty subpath, socanUseRowStoreForLazySlots()still returns true when there is only one such lazy slot. But the BE row-store path (RowIdStorageReader::read_batch_doris_format_row()->JsonbSerializeUtil::jsonb_to_block()) decodes only bycol_unique_idand never consultscolumn_paths(), so it materializes the whole variant root instead of the requested subcolumn. We need to disable row-store whenever any lazy slot has a subpath, not only when multiple slots share the same base column.