e2e: detect and try to fix potential post reboot image blob corruption. #700
Merged
askervin merged 1 commit on Jun 24, 2026
askervin (Collaborator) reviewed on Jun 24, 2026 and left a comment:
Looks nice. Two comments...
Under certain conditions, rebooting renders some of the image blobs unreadable, with blob reads failing with EOPNOTSUPP. One condition known to trigger this bug is running with BTRFS and rebooting to or from a kernel compiled from Torvalds' 'vanilla' git tree. One test that regularly triggers this bug is balloons/n4c16/test30-numa-disabled.

Add vm-post-reboot-runtime-check, which tries to detect and apply a fix for this bug. Patch test30 to do a runtime check after both reboots (transitions between stock and self-compiled kernels). Currently this is only implemented for containerd, with the CRI-O-specific bits marked and erroring out with a TODO.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Under certain preconditions, some image blob files become unreadable after a reboot, with reads failing with `EOPNOTSUPP`. This seems to be a BTRFS bug; one known trigger is running with BTRFS and a (recent) self-compiled kernel from Torvalds' git tree.

This is regularly triggered by the `balloons/n4c16/test30-numa-disabled` test case. That failure in turn causes all the remaining test cases running on the same VM (in other words, with the same emulated HW topology) to fail unconditionally because, among other things, all container creations fail afterwards. Since this is the last balloons test case for that topology, it causes all topology-aware tests on the same topology to fail.

To work around this, add a function (a rather big hammer) that tries to detect and fix up the runtime when it gets into such a condition. Currently this is only implemented for containerd. CRI-O is a TODO.
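For readers who want a feel for the detection half of such a check, here is a minimal sketch. It is not the code from this PR: the helper names `find_unreadable_blobs` and `check_blobs` are hypothetical, and the blob directory is assumed to be containerd's default content store location, which may differ on a given setup. The sketch only reports blobs that fail a full read; the actual fix-up step is not described in this conversation and is deliberately left out.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not the PR's vm-post-reboot-runtime-check.

# Assumed default containerd content store location (may differ per setup).
BLOB_DIR="${BLOB_DIR:-/var/lib/containerd/io.containerd.content.v1.content/blobs/sha256}"

# Print the path of every entry in the given directory that cannot be read in full.
find_unreadable_blobs() {
    local dir="$1" blob
    for blob in "$dir"/*; do
        # Skip the unexpanded glob of an empty or missing directory.
        [ -e "$blob" ] || continue
        # Read the whole file: the failure described above shows up on read.
        if ! cat "$blob" > /dev/null 2>&1; then
            echo "$blob"
        fi
    done
}

# Return 0 if every blob is readable, 1 otherwise (listing offenders on stderr).
check_blobs() {
    local bad
    bad="$(find_unreadable_blobs "$1")"
    if [ -n "$bad" ]; then
        echo "unreadable blobs:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}
```

On a test VM this could be run after each reboot as `check_blobs "$BLOB_DIR"`, with a non-zero exit status signalling that some recovery step is needed before further container creations are attempted.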