docs: add self-contained multi-LoRA RL example (SGLang + TensorRT-LLM backends) by kiddyboots216 · Pull Request #12 · togethercomputer/xorl

kiddyboots216 · 2026-06-16T21:22:51Z

What

Adds a single self-contained example — examples/server/multilora/multilora_rl_recipe.py — for RL-style training of many LoRA adapters over one shared base model, with rollouts served by a pluggable inference engine.

The flow: create N LoRA adapters (heterogeneous rank/optimizer) → train each a few steps → export → make servable → sample per-adapter rollouts (with per-token logprobs). It has no sibling imports and inlines its session specs, so it reads and runs as one file.
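The stage order above can be sketched roughly as follows. This is not the code in multilora_rl_recipe.py: `AdapterSpec`, `run_recipe`, and the four stage callables are hypothetical names, used only to show how heterogeneous specs might move through the same sequence of stages. Whether the real recipe finishes one adapter before starting the next, or runs each stage across all adapters, is not stated here; the sketch assumes the former.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AdapterSpec:
    """Hypothetical per-adapter session spec; field names are illustrative."""

    name: str
    rank: int
    optimizer: str
    train_steps: int = 2


def run_recipe(
    specs: List[AdapterSpec],
    train: Callable[[AdapterSpec], None],
    export: Callable[[AdapterSpec], str],
    make_servable: Callable[[str], str],
    sample: Callable[[str], List[dict]],
) -> Dict[str, List[dict]]:
    """Run train -> export -> make servable -> sample for every adapter."""
    rollouts: Dict[str, List[dict]] = {}
    for spec in specs:
        train(spec)                        # a few optimizer steps for this adapter
        exported = export(spec)            # identifier/path of the exported adapter
        served = make_servable(exported)   # handle the rollout engine can load
        rollouts[spec.name] = sample(served)
    return rollouts
```

Passing the stages in as callables keeps the sketch in one file and free of any particular backend, which mirrors the single-file intent of the example.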

Two rollout backends implement the same RolloutEngine interface:

SGLangEngine — HTTP: the training server's create_sampling_session + SGLang /generate with per-request lora_path + return_logprob.
TRTLLMEngine — TensorRT-LLM's in-process LLM API + LoRARequest + SamplingParams (moe_backend="CUTLASS", per-expert PEFT adapters).
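A minimal sketch of what a shared rollout interface could look like, assuming a single `sample` method. The method name, its signature, the `Rollout` dataclass, and both helper functions are assumptions for illustration, not the definitions in the recipe. The payload builder uses only the `/generate` request fields named above (`lora_path`, `return_logprob`) plus SGLang's `text` and `sampling_params` fields, and it builds the request body without sending it.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Protocol


@dataclass
class Rollout:
    """One sampled completion with per-token logprobs (illustrative shape)."""

    adapter: str
    text: str
    token_logprobs: List[float]


class RolloutEngine(Protocol):
    """Hypothetical interface that both backends would satisfy."""

    def sample(self, adapter: str, prompt: str, max_new_tokens: int) -> Rollout:
        ...


def sglang_generate_payload(
    prompt: str, lora_path: str, max_new_tokens: int
) -> Dict[str, Any]:
    """Build a JSON body for SGLang's /generate with a per-request adapter."""
    return {
        "text": prompt,
        "lora_path": lora_path,      # selects the adapter for this request
        "return_logprob": True,      # ask for per-token logprobs
        "sampling_params": {"max_new_tokens": max_new_tokens},
    }


def collect_rollouts(
    engine: RolloutEngine, adapters: List[str], prompt: str, max_new_tokens: int
) -> List[Rollout]:
    """Sample one rollout per adapter through whichever engine is plugged in."""
    return [engine.sample(name, prompt, max_new_tokens) for name in adapters]
```

Because `collect_rollouts` depends only on the protocol, an HTTP-backed engine and an in-process engine can be swapped without touching the calling code.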

The module header is a runbook: how to launch the XORL training server (using examples/server/configs/lora/qwen3_30b_a3b_lora.yaml), launch the rollout engine, and run the recipe.

Notes

Example only — no library code changes.
Lints clean under the repo's pre-commit hooks (ruff, ruff-format, codespell).

… backends) A single-file, engine-agnostic example for RL-style training of many LoRA adapters over one shared base: train each adapter, export it, serve it, and sample per-adapter rollouts (with token logprobs). The same recipe runs against two rollout backends behind one RolloutEngine interface — SGLang (HTTP) and TensorRT-LLM (in-process LLM API + LoRARequest). The header is a runbook for launching the XORL training server (using a LoRA config from this repo) and the rollout engine, then running the recipe.

kiddyboots216 wants to merge 1 commit into main from docs/multilora-rl-recipe
