[SPARK-57708][4.1][INFRA] Backport CI precompile artifact sharing and Coursier cache unification #56798
### What changes were proposed in this pull request?

This backports the CI build-time optimization series from `branch-4.x`/`master` to `branch-4.1`. A `precompile` job builds Spark once and publishes the compile output as an artifact that the downstream matrix jobs consume (falling back to a local build if the precompile job is absent or fails), the per-job Coursier caches are unified under a single key, and the shared compile artifacts use zstd compression.

Squashed backport of:

- [SPARK-56768] Share SBT compile artifact across pyspark CI jobs
- [SPARK-56831] Share SBT precompile artifact with sparkr CI job
- [SPARK-56943] Share SBT precompile artifact with JVM build matrix
- [SPARK-56964] Share Maven precompile artifact across maven_test matrix
- [SPARK-57069] Share SBT precompile artifact with docker/k8s integration test CI jobs
- [SPARK-57075] Share precompile Coursier cache with host-runner SBT jobs
- [SPARK-57142] Share SBT precompile artifact with tpcds-1g CI job
- [SPARK-57144] Unify Coursier cache to a single key across all jobs
- [SPARK-56830] Share SBT compile artifact with python hosted runner CI jobs
- [SPARK-57330] Switch shared CI compile artifacts to zstd compression

Adaptations for `branch-4.1`: the Python toolchain stays on 3.11 with the branch's existing package pins, and the GitHub Actions are kept at the versions already pinned on `branch-4.1` (`actions/cache@v4`, `actions/cache/restore@v4`, `actions/checkout@v4`, `actions/setup-java@v4`, `actions/download-artifact@v4`, `actions/upload-artifact@v4`) rather than pulling in the unrelated action version bumps. As on `branch-4.x`, the `precompile` job is the sole Coursier cache writer and all consumer jobs are restore-only.

### Why are the changes needed?

To cut redundant Scala/Maven compilation and Coursier cache duplication on `branch-4.1` CI, matching the optimization already present on the newer branches.

### Does this PR introduce _any_ user-facing change?

No. CI-only.

### How was this patch tested?

CI on this PR. The three workflow files validate with `python3 -c "import yaml; yaml.safe_load(...)"`.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-8)
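The consume-or-fall-back behaviour described in this PR can be sketched as follows. This is only an illustrative model in Python, not the actual implementation (which lives in the GitHub Actions workflow YAML). The names `obtain_compile_output`, `download_artifact`, and `build_locally` are hypothetical, and the convention that a missing artifact is signalled by `None` or an `OSError` is an assumption made for the sketch.

```python
from typing import Callable, Optional, Tuple


def obtain_compile_output(
    download_artifact: Callable[[], Optional[str]],
    build_locally: Callable[[], str],
) -> Tuple[str, str]:
    """Return (source, path) for the compile output a consumer job should use.

    Assumption: download_artifact returns a path on success, returns None
    when the precompile job did not run, and raises OSError when the
    download itself fails.
    """
    try:
        path = download_artifact()
    except OSError:
        # Precompile failed or the artifact could not be fetched.
        path = None

    if path is not None:
        return ("artifact", path)

    # Fallback: compile inside this job instead of reusing the shared output.
    return ("local-build", build_locally())
```

The point of this shape is that a consumer job never hard-depends on the `precompile` job: a missing or broken artifact only costs the time of a local build.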
@zhengruifeng as the original author for this feature. I took a quick look and everything seems okay. We don't really use |
uros-b
left a comment
Thank you @gaogaotiantian and @HyukjinKwon!
    uses: actions/cache@v4
    with:
      path: ~/.cache/coursier
      key: coursier-${{ runner.os }}-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
Minor note: the precompile writer's Coursier key prefix (`coursier-${{ runner.os }}-`) and the consumer build job's key prefix (`${{ runner.os }}-coursier-`) no longer match, so the consumer would miss the precompile-warmed cache. This is real drift from `branch-4.x`, which keeps `coursier-${{ runner.os }}-` on both. The workflow is macOS-only, so the writer never runs, but align to `coursier-${{ runner.os }}-` for consistency.
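To illustrate why the prefix mismatch in the review note above causes a miss, here is a simplified Python model of cache lookup by key and prefix. It is a sketch, not the `actions/cache` implementation: it ignores branch scoping and cache versions, the function name `restore_cache` is hypothetical, the hash values are placeholders, and it assumes the consumer passes its prefix as a restore key.

```python
from typing import List, Optional


def restore_cache(
    stored_keys: List[str], key: str, restore_keys: List[str]
) -> Optional[str]:
    """Simplified cache lookup; stored_keys is ordered oldest to newest.

    An exact match on key wins. Otherwise key and then each restore key
    are tried in order as prefixes, returning the newest stored key that
    starts with the prefix.
    """
    if key in stored_keys:
        return key
    for prefix in [key, *restore_keys]:
        matches = [k for k in stored_keys if k.startswith(prefix)]
        if matches:
            return matches[-1]
    return None


# Cache written by the precompile job (placeholder hash).
stored = ["coursier-Linux-abc123"]

# Consumer with the drifted prefix order: nothing matches.
miss = restore_cache(stored, "Linux-coursier-abc123", ["Linux-coursier-"])

# Consumer aligned to the writer's prefix: hit, even with a different hash.
hit = restore_cache(stored, "coursier-Linux-def456", ["coursier-Linux-"])
```

Under this model, `miss` is `None` and `hit` is the writer's entry, which is why keeping the same prefix on both sides matters.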