docs: add self-contained multi-LoRA RL example (SGLang + TensorRT-LLM backends) #12

Open
kiddyboots216 wants to merge 1 commit into main from docs/multilora-rl-recipe

@kiddyboots216

Contributor

What

Adds a single self-contained example — examples/server/multilora/multilora_rl_recipe.py — for RL-style training of many LoRA adapters over one shared base model, with rollouts served by a pluggable inference engine.

The flow: create N LoRA adapters (heterogeneous rank/optimizer) → train each a few steps → export → make servable → sample per-adapter rollouts (with per-token logprobs). It has no sibling imports and inlines its session specs, so it reads and runs as one file.
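
As a rough illustration of that flow, the per-adapter loop could be organized as below. This is a minimal sketch, not the recipe's actual code: `AdapterSpec`, `run_recipe`, and the four callables are hypothetical names, and the training server and rollout engine are stood in for by plain functions passed in by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AdapterSpec:
    """Hypothetical per-adapter settings; field names are illustrative only."""

    name: str
    rank: int        # LoRA rank, may differ per adapter
    optimizer: str   # optimizer name, may differ per adapter
    lr: float


def run_recipe(
    specs: List[AdapterSpec],
    train_step: Callable[[AdapterSpec], float],
    export: Callable[[AdapterSpec], str],
    make_servable: Callable[[str], str],
    sample: Callable[[str, str], dict],
    prompt: str,
    steps: int = 3,
) -> Dict[str, dict]:
    """Drive train -> export -> make servable -> sample for each adapter."""
    rollouts: Dict[str, dict] = {}
    for spec in specs:
        for _ in range(steps):
            train_step(spec)            # a few optimizer steps on this adapter
        path = export(spec)             # write adapter weights somewhere
        handle = make_servable(path)    # register the export with the engine
        rollouts[spec.name] = sample(handle, prompt)
    return rollouts
```

Passing the stages in as callables keeps the loop independent of which backend produces the rollouts.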

Two rollout backends implement the same RolloutEngine interface:

  • SGLangEngine — HTTP: the training server's create_sampling_session + SGLang /generate with per-request lora_path + return_logprob.
  • TRTLLMEngine — TensorRT-LLM's in-process LLM API + LoRARequest + SamplingParams (moe_backend="CUTLASS", per-expert PEFT adapters).
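
To make the shared interface concrete, here is a hedged sketch of what a `RolloutEngine` protocol and the SGLang request body might look like. The method signature, `build_sglang_generate_payload`, and `extract_logprobs` are assumptions made for illustration and may not match the recipe. The payload keys (`text`, `sampling_params`, `lora_path`, `return_logprob`) follow SGLang's native `/generate` API. The response shape read by `extract_logprobs` (entries under `meta_info["output_token_logprobs"]` whose first element is the logprob) is an assumption that should be checked against the SGLang version in use.

```python
from typing import Any, Dict, List, Protocol


class RolloutEngine(Protocol):
    """Hypothetical interface; the recipe's real signature may differ."""

    def sample(self, adapter: str, prompt: str, max_new_tokens: int) -> Dict[str, Any]:
        """Return generated text plus per-token logprobs for one adapter."""
        ...


def build_sglang_generate_payload(
    prompt: str,
    lora_path: str,
    max_new_tokens: int = 64,
    temperature: float = 1.0,
) -> Dict[str, Any]:
    """Build a JSON body for POST /generate that selects one LoRA adapter."""
    return {
        "text": prompt,
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
        "lora_path": lora_path,    # per-request adapter selection
        "return_logprob": True,    # ask for token logprobs in the response
    }


def extract_logprobs(response: Dict[str, Any]) -> List[float]:
    """Pull output-token logprobs from a response (assumed shape, see lead-in)."""
    entries = response.get("meta_info", {}).get("output_token_logprobs") or []
    return [float(entry[0]) for entry in entries]
```

An HTTP-backed engine would POST this payload to the SGLang server, while an in-process engine could satisfy the same `sample` method without any network call.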

The module header is a runbook: how to launch the XORL training server (using examples/server/configs/lora/qwen3_30b_a3b_lora.yaml), launch the rollout engine, and run the recipe.

Notes

  • Example only — no library code changes.
  • Lints clean under the repo's pre-commit hooks (ruff, ruff-format, codespell).

… backends)

A single-file, engine-agnostic example for RL-style training of many LoRA
adapters over one shared base: train each adapter, export it, serve it, and
sample per-adapter rollouts (with token logprobs). The same recipe runs against
two rollout backends behind one RolloutEngine interface — SGLang (HTTP) and
TensorRT-LLM (in-process LLM API + LoRARequest). The header is a runbook for
launching the XORL training server (using a LoRA config from this repo) and the
rollout engine, then running the recipe.