
[WIP] Updated DSv4 vllm B300 MTP #1271

Open
wzhao18 wants to merge 3 commits into main from nv/dsv4-b300-mtp-new

Conversation

@wzhao18 (Collaborator) commented May 4, 2026

No description provided.

github-actions bot (Contributor) commented May 4, 2026

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work. Thank you!

PR authors are responsible for ensuring that, after merging, all GitHub Actions jobs fully pass. Often, failures are just flakes, and simply re-running the failed jobs will fix them; if you re-run failed jobs, you remain responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get approval from the respective company's CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@wzhao18 wzhao18 marked this pull request as ready for review May 4, 2026 02:17
@wzhao18 wzhao18 requested a review from a team May 4, 2026 02:17
claude bot (Contributor) commented May 4, 2026

Claude finished @wzhao18's task in 1m 28s.


Review of PR #1271

  • Gather context and read changed files
  • Validate master config changes
  • Validate perf-changelog.yaml
  • Validate benchmark script

Summary

LGTM - no blocking issues found.

What this PR does: Upgrades the vLLM image from v0.20.0-cu130 to v0.20.1-cu130 for the B300 MTP config and adds new data-parallel attention + expert parallelism search space entries. The benchmark script is updated to conditionally set --moe-backend deep_gemm_mega_moe and MAX_NUM_BATCHED_TOKENS=2048 when DP_ATTENTION=true.
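The DP-attention conditional described above can be sketched roughly as follows. This is a minimal illustration only, not the PR's actual code: the variable and flag names (`DP_ATTENTION`, `MAX_NUM_BATCHED_TOKENS`, `--moe-backend deep_gemm_mega_moe`) are assumed from the summary, and the real benchmark script may structure this differently.

```shell
# Hypothetical sketch of the DP-attention conditional; all names are
# assumed from the PR summary, not copied from the actual script.
build_extra_args() {
  EXTRA_ARGS=""
  if [ "${DP_ATTENTION:-false}" = "true" ]; then
    # DP-attention path: switch the MoE backend and cap the batched-token budget.
    EXTRA_ARGS="--moe-backend deep_gemm_mega_moe"
    MAX_NUM_BATCHED_TOKENS=2048
  fi
}

DP_ATTENTION=true
build_extra_args
echo "extra server args: ${EXTRA_ARGS}"
```

When `DP_ATTENTION` is unset or not `true`, `EXTRA_ARGS` stays empty and the server keeps its default batched-token limit.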

Checks passed:

  • Image vllm/vllm-openai:v0.20.1-cu130 is publicly accessible on Docker Hub
  • Model prefix dsv4 is valid
  • perf-changelog.yaml entry is correctly appended at the end of the file
  • Expert parallelism is correctly conditional on EP_SIZE (pre-existing)
  • --dsv4 flag implies --use-chat-template (verified in benchmark_lib.sh)
  • Server launch command is properly formatted with line continuations

