
fix(seekdb): Index may contain entries pointing to deleted rows in main table when ALTER TABLE (add unique index, rename index, modify primary key, etc.) runs concurrently with queries #908

Open
ep-12221 wants to merge 1 commit into master from issue/2026040900115285358


@ep-12221

Contributor

Task Description

Fix a false positive error (error code 4377) that occurs during index lookup when ALTER TABLE operations (such as adding a unique index, renaming an index, or modifying the primary key) run concurrently with queries. Under these conditions, an index might contain entries pointing to rows in the main table that have been marked as deleted (either from an uncommitted transaction or a committed one not yet cleaned up). The normal query path (need_iter_del_row=false) would encounter these deleted rows and incorrectly trigger the 4377 defense check.
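The state described above can be pictured with a small toy model. This is only a sketch: `RowState`, `MainRow`, `IndexTable`, `MainTable`, and `lookup_via_index` are hypothetical names invented for illustration and are not seekdb types. It shows that an index entry can resolve to three different outcomes, and that a delete-marked row is distinct from a row that is missing altogether.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: these types are hypothetical and are not seekdb classes.
enum class RowState { kLive, kDeleteMarked, kMissing };

struct MainRow {
  std::string payload;
  bool delete_marked;  // true while a delete marker is still visible in storage
};

using IndexTable = std::map<std::string, int>;  // index key -> primary key
using MainTable = std::map<int, MainRow>;       // primary key -> row

// Resolve an index entry against the main table and report what the lookup sees.
RowState lookup_via_index(const IndexTable& index, const MainTable& main_table,
                          const std::string& index_key) {
  const auto entry = index.find(index_key);
  assert(entry != index.end() && "index_key must be present in the index");
  const auto row = main_table.find(entry->second);
  if (row == main_table.end()) {
    // Assumption: this is the kind of inconsistency a defensive check targets.
    return RowState::kMissing;
  }
  // A delete-marked row is the transient state this PR describes.
  return row->second.delete_marked ? RowState::kDeleteMarked : RowState::kLive;
}
```

In this model, an index entry whose main-table row carries a delete marker yields `RowState::kDeleteMarked`, which is the case the normal query path previously reported as error 4377.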

Solution Description

The fix adds a `&& !xxx.row_flag_.is_delete` check (where `xxx` stands for the row variable at each site) to the conditions that trigger error 4377 in two locations:

  1. In `ObSingleMerge::inner_get_next_row` at `ob_single_merge.cpp:334`.
  2. In `ObMultipleGetMerge::inner_get_next_row` at `ob_multiple_get_merge.cpp:243`.

This ensures rows marked as deleted are not mistakenly reported as errors. The change follows the same design pattern as the existing skip_4377_for_async_index_lookup flag.
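The shape of the guard can be sketched as follows. This is a simplified model under stated assumptions, not the code in `ob_single_merge.cpp` or `ob_multiple_get_merge.cpp`: `RowFlag`, `LookupContext`, `row_is_returned`, `should_raise_4377`, and `demo_guard` are hypothetical stand-ins, and the real condition may involve further terms. Only `need_iter_del_row` and `skip_4377_for_async_index_lookup` are names taken from this PR.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real seekdb types and field layout may differ.
enum class RowFlag { kNotExist, kExist, kDelete };

struct LookupContext {
  bool need_iter_del_row;                 // caller wants delete-marked rows returned
  bool skip_4377_for_async_index_lookup;  // existing opt-out named in this PR
};

// True when the fetched main-table row is handed back to the caller.
bool row_is_returned(const LookupContext& ctx, RowFlag flag) {
  return flag == RowFlag::kExist ||
         (ctx.need_iter_del_row && flag == RowFlag::kDelete);
}

// Guard with the extra term: a delete-marked row no longer raises 4377.
bool should_raise_4377(const LookupContext& ctx, RowFlag flag) {
  if (row_is_returned(ctx, flag)) {
    return false;
  }
  return !ctx.skip_4377_for_async_index_lookup &&
         flag != RowFlag::kDelete;  // models the added `!row_flag_.is_delete` term
}

// Usage example for the normal query path (need_iter_del_row = false).
void demo_guard() {
  const LookupContext normal{false, false};
  assert(!should_raise_4377(normal, RowFlag::kDelete));   // transient state, no error
  assert(should_raise_4377(normal, RowFlag::kNotExist));  // assumed to stay an error
}
```

Dropping the `flag != RowFlag::kDelete` term from this sketch makes `should_raise_4377` return true for delete-marked rows on the normal query path, which corresponds to the false positive described in the task.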

Passed Regressions

Local compilation is blocked by the Windows 11 WDAC policy (PowerShell ConstrainedLanguage mode), which prevents gen_parser_win.ps1 and build.ps1 from calling .NET APIs, so the change could not be compiled or verified locally. The code change is only 2 lines; its logic has been reviewed but not executed. Full validation is pending the farm test cases.

Upgrade Compatibility

Other Information

DIMA: 2026040900115285358

Release Note

Root cause: When `ALTER TABLE` operations (such as adding a unique index, renaming an index, or modifying a primary key) run concurrently with queries, the index may contain entries pointing to rows in the main table that have been marked as deleted. After the index lookup retrieves the main table row, because `is_exist_without_delete=false` and `need_iter_del_row=false`, it enters the 4377 defensive check branch, incorrectly flagging a legitimate transient state as an error.

Fix: Added a `!row_flag_.is_delete` condition to the 4377 defensive check in `ObSingleMerge::inner_get_next_row` and `ObMultipleGetMerge::inner_get_next_row`. Deleted rows are a valid state and should not trigger the 4377 error.

Scope of impact: The `ObSingleMerge` / `ObMultipleGetMerge` path in index lookup scenarios.

DIMA: 2026040900115285358

Co-Authored-By: Claude Opus 4.6 <[REDACTED_EMAIL]>
@ep-12221

Contributor Author

The corresponding DIMA issue is [[Not on duty][Windows system][mysqltest] ERROR 4377 (HY000): fatal internal error in [[index lookup]ObSingleMerge::inner_get_next_row]; mixed case runs report error 4377]
