
Update MiniMax M2.5 FP8 H200 vLLM agg recipes #1298

Draft

anish-shanbhag wants to merge 1 commit into SemiAnalysisAI:main from anish-shanbhag:ashan/port-inferencemax-53-minimax-h200-no-slurm-shared


Conversation


@anish-shanbhag anish-shanbhag commented May 7, 2026

Update MiniMax-M2.5 FP8 H200 vLLM to vllm/vllm-openai:v0.20.1-ubuntu2404

Sets vLLM serving knobs in benchmarks/single_node/minimaxm2.5_fp8_h200.sh: the max-model-len from the generated benchmark config, the previous eval max-model-len handling, an FP8 KV cache, FlashInfer attention with autotuning, the Triton MoE backend, and MiniMax QK norm fusion.
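A minimal sketch of how a recipe script might wire up a subset of these knobs. This is illustrative only, not the contents of the actual PR: the model ID, default length, and launcher invocation are assumptions; only `--max-model-len`, `--kv-cache-dtype fp8`, and the `VLLM_ATTENTION_BACKEND` environment variable are standard vLLM options, and the autotune, Triton MoE, and QK norm fusion settings from the PR description are omitted here because their exact flag names are not shown in the source.

```shell
#!/usr/bin/env bash
# Illustrative sketch of a single-node vLLM serving recipe.
# Assumptions: model ID and MAX_MODEL_LEN default are hypothetical.
set -euo pipefail

# Hypothetical stand-in for the generated benchmark max-model-len.
MAX_MODEL_LEN="${MAX_MODEL_LEN:-8192}"

# FlashInfer attention backend (standard vLLM environment variable).
export VLLM_ATTENTION_BACKEND=FLASHINFER

ARGS=(
  --model MiniMaxAI/MiniMax-M2.5        # hypothetical model identifier
  --max-model-len "$MAX_MODEL_LEN"
  --kv-cache-dtype fp8                  # FP8 KV cache
)

# Print the assembled command rather than launching the server,
# so the sketch stays runnable without GPUs.
echo "vllm serve ${ARGS[*]}"
```

In a real recipe the final line would exec `vllm serve` directly; echoing the command keeps the sketch inspectable without hardware.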

