chore: explore parallel library generation #12820
diegomarquezp wants to merge 10 commits into main
Code Review
This pull request enables parallel library generation for the Java SDK by implementing a ThreadPoolExecutor and a ThreadLocalStream for log management, alongside file locking and unique temporary directories to ensure thread safety. Review feedback points out that the log-capturing mechanism is insufficient for subprocesses and lacks necessary file-like attributes, potentially causing runtime failures. Additionally, improvements were suggested regarding PEP 8 compliance, reducing module coupling, and allowing configurable thread pool limits.
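The log-management idea summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `ThreadLocalStream` body, `run_captured`, `generate_all`, and the `max_workers` parameter are all assumptions made for the example. It delegates unknown attributes to a fallback stream, which is one way to provide the file-like attributes the review mentions.

```python
import io
import sys
import threading
from concurrent.futures import ThreadPoolExecutor


class ThreadLocalStream:
    """Hypothetical stand-in: routes writes to a per-thread buffer if one is set."""

    def __init__(self, fallback):
        self.fallback = fallback
        self.local = threading.local()

    def _target(self):
        # Use this thread's buffer when present, otherwise the shared fallback.
        buffer = getattr(self.local, "buffer", None)
        return self.fallback if buffer is None else buffer

    def write(self, text):
        return self._target().write(text)

    def flush(self):
        self._target().flush()

    def __getattr__(self, name):
        # Delegate other file-like attributes (encoding, isatty, ...) to the fallback.
        return getattr(self.fallback, name)


def run_captured(stream, name):
    # Each worker thread installs its own buffer, so output does not interleave.
    stream.local.buffer = io.StringIO()
    try:
        print(f"generating {name}", file=stream)
        return stream.local.buffer.getvalue()
    finally:
        del stream.local.buffer


def generate_all(names, max_workers=4):
    # max_workers is exposed as a parameter so the pool size stays configurable.
    stream = ThreadLocalStream(sys.stdout)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        logs = list(pool.map(lambda n: run_captured(stream, n), names))
    return dict(zip(names, logs))
```

A stream like this only sees writes made through Python. Output from child processes goes to the inherited file descriptor, so it would need to be captured separately, for example with `subprocess.run(..., capture_output=True)`.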
    def library_generation_worker(config, library_path, library, repo_config):
        buffer = StringIO()
        has_local = hasattr(sys.stdout, "local")
The check hasattr(sys.stdout, "local") introduces a tight coupling between this module and the specific implementation of sys.stdout in generate_repo.py. This makes the function less reusable and harder to test in isolation. Consider passing a logger or a buffer object explicitly to the worker function instead of relying on global state monkey-patching.
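As a rough sketch of the alternative described above, the worker could receive its log buffer as an argument instead of inspecting `sys.stdout`. The reduced signature `library_generation_worker(library, log)` and the helper `generate_in_parallel` are hypothetical and only illustrate the shape of the change; the real worker takes `config`, `library_path`, `library`, and `repo_config`.

```python
import io
from concurrent.futures import ThreadPoolExecutor


def library_generation_worker(library, log):
    """Hypothetical simplified worker: writes progress to an explicit log object."""
    log.write(f"generating {library}\n")
    return library


def generate_in_parallel(libraries, max_workers=4):
    # One buffer per library, owned by the caller rather than by global state.
    buffers = {name: io.StringIO() for name in libraries}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(library_generation_worker, name, buffers[name])
            for name in libraries
        ]
        for future in futures:
            future.result()  # re-raises any exception from the worker
    return {name: buf.getvalue() for name, buf in buffers.items()}
```

With this shape the worker can be tested in isolation by passing any file-like object, at the cost of threading the extra argument through every function that logs.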
We are keeping this non-invasive. Passing a logger or buffer explicitly would require a major refactor of the scripts.
Single-core processing takes 1h40m:
https://github.com/googleapis/google-cloud-java/actions/runs/24492359565/job/71579750324?pr=12807
The parallel approach in this PR takes ~1h:
https://pantheon.corp.google.com/cloud-build/builds;region=us-central1/776e4ff9-415b-47e4-9c2b-afdc12e3b095?project=java-hermetic-build-prod