ext4: fix extent tree splits for large fragmented files#412
ext4: fix extent tree splits for large fragmented files#412apinonformoso wants to merge 3 commits into
Conversation
deitch
left a comment
There was a problem hiding this comment.
Something I didn't understand
| } else { | ||
| // Write new leaf nodes to the disk using parent reference | ||
| err := writeNodeToDisk(firstLeaf, fs, parent) | ||
| // Non-root split: allocate disk blocks for the new leaf nodes. |
There was a problem hiding this comment.
This section of code looks identical to the root case above. How is it different?
There was a problem hiding this comment.
Right, they are identical now.
The fix was to make the non-root path do the same allocate + writeNodeToBlock as the root path, which ended as two duplicate blocks since I kept the original if/else structure, but it doesn't make sense to leave it like that.
For context, the else branch used writeNodeToDisk, assuming the parent could resolve where to write. That caused a lookup failure with block number not found for node because the new split nodes aren't in parent.children yet (the parent is updated later in extendLeafNode).
I'll collapse this into a single block.
deitch
left a comment
There was a problem hiding this comment.
This has less repeated code, which is better. thank you.
I see a logic concern. If I understood it correctly, by removing parent from splitLeafNode(), it no longer knows if this is root or not. Does this not mean that it always creates 2 new nodes? Let's say I am starting with the root, and it grows too big. At that point, it should create two new nodes in blocks (A and B), and the root points to A and B. Now an internal node gets too big, e.g. A. ext4 normally does not create two new nodes, C and D. Instead, it creates one new node (C), and root points to A, C, B. Instead, it looks like we add C and D, the root points to C, D, B and A is lost entirely.
The only time we do need more than 1 node (other than when outgrowing root) is when the parent of any level is full and we need internal nodes. E.g. root has 4 slots, and if we are already at A, B, C, D and need another slot, then (A E B C D), but 5 slots do not fit, so we create E and then internal I1 I2 and then repoint root -> I1 and root ->I1, etc. But that, I believe, is handled correctly.
|
Good catch, you're right that orphaning the original block isn't correct. To clarify the mechanics: The original node still carries its block in |
Summary
Fixes bugs in ext4 extent tree growth when leaf or internal nodes split during
extendExtentTree. These failures showed up when writing large, fragmented files whose extent trees grow beyond the inode-root limits, for example, copying large binaries from container images such asnode:22-slim.Previously:
writeNodeToDiskwithout allocated disk blocks, causing:slice bounds out of range) because the over-capacity root was never split.This change:
writeNodeToBlockwhen splitting non-root leaf nodes (consistent with the existing root-split path)splitInternalNodeChildrenwhen child count exceedsmaxRoot cause
writeNodeToDiskresolves a node's disk block by looking it up inparent.children. Nodes created during a split are not yet registered there, so the lookup fails. Root splits already avoided this by allocating blocks explicitly; non-root splits did not.When a leaf split replaced one child with two under the inode-root internal node, the root could exceed its 4-child limit without being split, producing an invalid tree.
Test plan
go test ./filesystem/ext4/ -run '^TestExtendExtentTree' -count=1fileBlock 340withblock number not found for node