feat: add tarball overlays for source archive modification#224
feat: add tarball overlays for source archive modification#224Tonisal-byte wants to merge 8 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds support for “tarball overlays” that modify files inside source tarballs (extract/modify/repack) and updates sources preparation to account for in-place tarball changes.
Changes:
- Introduces a
tarballutility package for compression detection, extraction, and deterministic repacking. - Adds new overlay types (
tarball-file-remove,tarball-search-replace,tarball-patch) plus grouping/application logic in source prep. - Updates
sourcesfile update flow to rehash tarballs modified by overlays and extends docs/tests accordingly.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/utils/tarball/tarball.go | Implements compression detection, extraction, and deterministic repacking. |
| internal/utils/tarball/tarball_test.go | Adds tests for compression detection, extract-root resolution, and deterministic repack. |
| internal/projectconfig/overlay.go | Adds tarball overlay types, validation, and ModifiesTarball(). |
| internal/projectconfig/overlay_test.go | Extends validation/modification classification tests for tarball overlays. |
| internal/app/azldev/core/sources/tarballoverlays.go | Implements tarball overlay grouping, extract/modify/repack pipeline, and patch application. |
| internal/app/azldev/core/sources/tarballoverlays_internal_test.go | Adds unit tests for grouping and tarball overlay operations. |
| internal/app/azldev/core/sources/sourceprep.go | Applies tarball overlays first and rehashes modified tarballs when updating sources. |
| internal/app/azldev/core/sources/sourceprep_test.go | Updates expected error text for sources parsing failures. |
| internal/app/azldev/cmds/component/preparesources.go | Refactors option wiring into helper to reduce cyclomatic complexity. |
| docs/user/reference/config/overlays.md | Documents new tarball overlays and related fields/behavior. |
|
|
||
| // globFilesInDir finds files under root matching a glob pattern. | ||
| // Supports doublestar patterns (e.g., "**/*.md"). | ||
| func globFilesInDir(root, pattern string) ([]string, error) { |
There was a problem hiding this comment.
Do we not already have helpers to do this?
a495c4d to
3eb0c57
Compare
ad0e582 to
3a1883c
Compare
|
|
||
| ### Archive Overlays | ||
|
|
||
| A `file-remove` or `file-search-replace` overlay can modify files **inside** a source archive |
There was a problem hiding this comment.
Does this support removing/editing files in nested archives? I'm thinking about cases where a.tar.gz contains a/internal/archive.tar.gz and you want to remove a.tar.gz/a/internal/archive.tar.gz/nested/dir/file-to-remove.txt.
If it doesn't, I don't think it's a P0 (I believe all 10 scripts can be replaced without the nested archives supported), but with the new archive field required for differentiating between "regular" files and archives, adding this extension in the future may be tricky.
Considering that, do you think it would be possible to use the file path alone to denote the file to be removed/edited and automatically detect, if a file on the path is a directory vs an archive as in the a.tar.gz/a/internal/archive.tar.gz/nested/dir/file-to-remove.txt example? Starting with this would give us support for nested archives in one go.
PS Cool idea with re-using existing overlays.
There was a problem hiding this comment.
Thanks! The idea was originally Reubens but I'm glad the implementation is sound. I think it would be better if we revisited this nested archive format in the future. I do agree it's important, but it might introduce complexity to the overlay application process
0d6a73d to
073e58f
Compare
073e58f to
96d9a9c
Compare
0a60b3e to
2a8345b
Compare
Add three new overlay types (tarball-file-remove, tarball-search-replace, tarball-patch) that modify files inside source tarballs during source preparation. Operations are performed in pure Go on the host. Includes: - internal/utils/tarball: reusable deterministic tar extract/repack library - Overlay type registration, validation, and fingerprinting - Source prep integration with sources file hash rehashing - User documentation and TOML examples
2a8345b to
cb8f671
Compare
cb8f671 to
19be590
Compare
| t.Run("deletes matching files in the extracted tree", func(t *testing.T) { | ||
| extractRoot := t.TempDir() | ||
| require.NoError(t, os.WriteFile(extractRoot+"/keep.txt", []byte("keep"), fileperms.PrivateFile)) | ||
| require.NoError(t, os.WriteFile(extractRoot+"/remove.conf", []byte("x"), fileperms.PrivateFile)) | ||
|
|
||
| extractFS, err := rootfs.New(extractRoot) | ||
| require.NoError(t, err) | ||
|
|
||
| defer extractFS.Close() | ||
|
|
||
| overlay := projectconfig.ComponentOverlay{ | ||
| Type: projectconfig.ComponentOverlayRemoveFile, | ||
| Filename: "*.conf", | ||
| } | ||
|
|
||
| err = applyNonSpecOverlay(ctx, ctx.FS(), extractFS, overlay) | ||
| require.NoError(t, err) | ||
|
|
||
| assert.FileExists(t, extractRoot+"/keep.txt") | ||
| assert.NoFileExists(t, extractRoot+"/remove.conf") | ||
| }) |
There was a problem hiding this comment.
I don't get this test. Does it test anything introduced in archiveoverlays.go? I only see a call to applyNonSpecOverlay, which existed for a while and is already tested elsewhere (I hope).
| // TestArchiveFileRemove verifies that archive-scoped file-remove overlays are | ||
| // routed through the shared [applyNonSpecOverlay] machinery against the | ||
| // extract-root FS (i.e., the same code path that [processArchive] uses). |
There was a problem hiding this comment.
- Why do we test if we're re-using
applyNonSpecOverlay? Seems like internal implementation detail and I'm not sure, it's a specific requirement we want to impose. Re-using existing code is definitely a good thing, though, I'm not questioning that. - If we do want to check for this specific case, however, the name
TestArchiveFileRemoveis too broad and suggests we're testing the removal of files inside archives, not re-use ofapplyNonSpecOverlay. - I don't think any of the tests in this function actually check we're calling
applyNonSpecOverlayfrom the archive overlays logic.
| ```toml | ||
| [[components.mypackage.overlays]] | ||
| type = "file-remove" | ||
| file = "mypackage-1.0.tar.gz/vendor/**" |
There was a problem hiding this comment.
I couldn't find tests for globs to make sure the following cases are treated correctly:
<archive>/some_folder- removes the wholesome_folderfrom inside the archive.<archive>/some_folder/*- removes the contents ofsome_folderinside the archive but keeps the folder itself.<archive>/some_folder/**- same as above.<archive>/*/some_file.txt- removessome_file.txtfrom TOP LEVEL folders inside the archive.<archive>/**/some_file.txt- removessome_file.txtfrom ALL folders inside the archive regardless of depth.- Repetition of the cases above, but instead of
<archive>we have a local checked-in directory.
Adds tarball overlay support for modifying source archives by applying file overlays during repack.
This is part 2 of 2 in the tarball overlay feature stack.