[codex] Use PR41015 fix setup for GB200 MTP2 high throughput #1273
alec-flowers wants to merge 1 commit into main
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.
PR authors are responsible for ensuring that all GitHub Action jobs fully pass after merging. Failures are often just flakes, and re-running the failed jobs will fix them. If you re-run failed jobs, you are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow
As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack.
Claude finished @alec-flowers's task in 41s.
LGTM, no blocking issues found. The change is a single-line swap of the recipe's setup_script.
LGTM — single-line YAML setup_script swap for the GB200 MTP2 high-throughput benchmark recipe.
Overview
The PR makes a one-line change to a single benchmark recipe YAML (benchmarks/multi_node/srt-slurm-recipes/vllm/deepseek-v4/8k1k/disagg-gb200-high-tpt-megamoe-mtp2.yaml), changing setup_script from vllm-container-deps.sh to vllm-container-deps-pr41015-fp4-fix.sh so this recipe picks up the PR41015 FP4 revert wrapper that recovered GB200 MTP2 throughput.
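One way to confirm that a recipe edit really is a one-line swap is to diff the old and new recipe text and count the changed lines. The sketch below uses Python's standard difflib; the helper name changed_lines and the sample recipe text are hypothetical, and only the two script names come from this PR.

```python
import difflib


def changed_lines(before, after):
    """Return the removed ("- ") and added ("+ ") lines between two texts."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits "? " hint lines and "  " unchanged lines; skip those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical recipe text; the real YAML layout is not shown in this PR page.
old_recipe = "name: demo\nsetup_script: vllm-container-deps.sh\nmodel: demo\n"
new_recipe = old_recipe.replace(
    "vllm-container-deps.sh", "vllm-container-deps-pr41015-fp4-fix.sh"
)

for line in changed_lines(old_recipe, new_recipe):
    print(line)
```

A single-line swap shows up as exactly one removed line and one added line.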
Security risks
None. The change is a benchmarking config knob; no auth, crypto, network exposure, or production code paths are touched.
Level of scrutiny
Low. This is a benchmark recipe in benchmarks/multi_node/srt-slurm-recipes/ that is scoped to a single GB200 MTP2 high-throughput run. The referenced patch script lives in an external srt-slurm branch (aflowers/vllm-gb200-v0.20.0), and the PR description confirms its existence and that it parses cleanly. No other recipes are affected.
Other factors
Bug hunting found nothing. The change follows the existing pattern of pointing recipes at a setup script by name, mirrors recent commits in this area (e.g. 07df028, 0f630e1), and is trivially reversible.
Summary
Point the recipe's setup_script at vllm-container-deps-pr41015-fp4-fix.sh.
The script lives on the aflowers/vllm-gb200-v0.20.0 srt-slurm branch.
Why
This lets the InferenceX rerun use the srt-slurm PR41015 FP4 revert wrapper that recovered the GB200 MTP2 throughput regression in SA testing.
Validation
Confirmed setup_script resolves to vllm-container-deps-pr41015-fp4-fix.sh.
Confirmed the srt-slurm branch contains configs/patches/vllm-container-deps-pr41015-fp4-fix.sh and configs/patches/vllm_revert_pr41015_fp4_cvt.py.
Ran bash -n on the srt-slurm wrapper and python3 -m py_compile on the patch helper.
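The validation steps above could be scripted roughly as follows. This is a minimal sketch, not the author's actual tooling: the function names are hypothetical, and the setup_script lookup assumes the key and its value sit on a single line (a real YAML parser would be more robust). The bash -n and py_compile calls mirror the two commands named in the PR description.

```python
import os
import py_compile
import re
import subprocess
import tempfile

# Expected script name, taken from the PR description.
EXPECTED_SETUP_SCRIPT = "vllm-container-deps-pr41015-fp4-fix.sh"


def read_setup_script(recipe_text):
    """Return the value of the first `setup_script:` line, or None.

    Assumption: key and value are on one line; this is not a YAML parser.
    """
    for line in recipe_text.splitlines():
        match = re.match(r"\s*setup_script:\s*(\S+)", line)
        if match:
            return match.group(1).strip("'\"")
    return None


def bash_parses(path):
    """Syntax-check a shell script with `bash -n` (parse only, no execution)."""
    result = subprocess.run(["bash", "-n", path], capture_output=True)
    return result.returncode == 0


def python_compiles(path):
    """Byte-compile a Python file, as `python3 -m py_compile` does."""
    with tempfile.TemporaryDirectory() as scratch:
        try:
            # Write the .pyc to a scratch dir so the source tree stays clean.
            py_compile.compile(
                path, cfile=os.path.join(scratch, "out.pyc"), doraise=True
            )
        except py_compile.PyCompileError:
            return False
    return True
```

Against a checkout, read_setup_script would be called on the recipe file's contents and compared with EXPECTED_SETUP_SCRIPT, while bash_parses and python_compiles would be pointed at the two files under configs/patches on the srt-slurm branch.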