feat: LTX-2 support by pwilkin · Pull Request #1458 · leejet/stable-diffusion.cpp

pwilkin · 2026-04-23T23:48:24Z

Please have mercy, had to murder my Claude Code to get this working.

SD_CUDA_DEVICE=1 SD_CUDA_DEVICE_CLIP=-1 SD_CUDA_DEVICE_VAE=0 timeout 1800 ./bin/sd-cli -M vid_gen \
    --diffusion-model /media/ilintar/D_SSD/models/ltx-2/ltx-2.3-22b-dev-Q5_K_S.gguf \
    --llm /media/ilintar/D_SSD/models/ltx-2/gemma-3-12b-it-qat-IQ4_XS.gguf \
    --vae /media/ilintar/D_SSD/models/ltx-2/ltx-2.3-22b-dev_video_vae.safetensors \
    -m /media/ilintar/D_SSD/models/ltx-2/ltx-2.3-22b-dev_embeddings_connectors.safetensors \
    --gemma-tokenizer /home/ilintar/.cache/huggingface/hub/models--google--gemma-3-12b-it/snapshots/96b6f1eccf38110c56df3a15bffe176da04bfd80/tokenizer.json \
    -W 640 -H 480 --video-frames 25 --steps 60 --fps 24 --cfg-scale 6.0 --seed 42 \
    -p "a cat walking on a sandy beach at sunset, cinematic, 4k" \
    -o /tmp/ltx2_smoke.webm

ltx2_smoke_v2.webm

Green-Sky · 2026-04-24T09:39:55Z

I think there is some good stuff we can pull out of here (:

btw, why an IQ4_XS quant of the QAT model (gemma-3-12b-it-qat-IQ4_XS.gguf)?

pwilkin · 2026-04-24T10:41:45Z

@Green-Sky that's a very good question, probably "because I wasn't thinking about it" is the proper answer ;)

JohnLoveJoy · 2026-04-24T13:40:58Z

Great work. How does this perform compared to ComfyUI?

pwilkin · 2026-04-24T13:51:46Z

Haven't compared yet but gonna optimize further.

mudler · 2026-04-24T14:51:47Z

Wow! I was actually playing with this myself as well, letting Claude run on its own, and got it working yesterday. I'll open a PR just for reference.

This is the result I got with it:

pwilkin · 2026-04-24T14:52:39Z

Slightly funky still, so guess there's a subtle error somewhere, but I added fitting, so I managed to get 80 frames at 720p ("a black cat jumping at a brown mouse on green grass"):

ltx2_cat_mouse_720p.webm

pwilkin · 2026-04-24T14:57:15Z

@mudler yours looks much better, wonder if that's quants or if my implementation has a bug somewhere.

Edit: might be distilled vs full too though.

pwilkin · 2026-04-24T15:07:00Z

Probably FA (flash attention) is the culprit here. I'm running this on 26 GB VRAM total (3080 10 GB + 5060 16 GB), so really struggling to get anything reasonable :)

wbruna · 2026-04-24T15:24:55Z

+    //   SD_CUDA_DEVICE_VAE      VAE                          (falls back to SD_CUDA_DEVICE)
+    //   SD_CUDA_DEVICE_CONTROL  ControlNet                    (falls back to SD_CUDA_DEVICE)
+    //   SD_VK_DEVICE            same pattern for the Vulkan build
+    // Setting any of these to -1 forces CPU for that component.

Just as a reminder: this should be coordinated with #1184.

pwilkin · 2026-04-24

Yeah, this is just a rough PoC for now.
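
For readers following the review thread, the fallback rule stated in the diff comment above (a per-component variable first, then `SD_CUDA_DEVICE`, with `-1` meaning CPU) could be sketched roughly as follows. This is not the PR's code: `env_int`, `resolve_device`, and the default device `0` are hypothetical names and assumptions chosen for illustration. Only the `SD_CUDA_DEVICE*` variable names come from the thread.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper (not from the PR): parse an integer environment variable.
// Returns std::nullopt when the variable is unset, empty, or not a valid integer.
static std::optional<int> env_int(const char* name) {
    const char* raw = std::getenv(name);
    if (raw == nullptr || *raw == '\0') {
        return std::nullopt;
    }
    char* end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    if (*end != '\0') {
        return std::nullopt;  // trailing garbage, treat as unset
    }
    return static_cast<int>(value);
}

// Hypothetical resolver for one component, e.g. "VAE", "CLIP", "CONTROL".
// Lookup order: SD_CUDA_DEVICE_<component>, then SD_CUDA_DEVICE, then
// default_device (assumed to be 0 here). A result of -1 means "use the CPU".
static int resolve_device(const std::string& component, int default_device = 0) {
    if (auto v = env_int(("SD_CUDA_DEVICE_" + component).c_str())) {
        return *v;
    }
    if (auto v = env_int("SD_CUDA_DEVICE")) {
        return *v;
    }
    return default_device;
}
```

Under these assumptions, the environment from the opening command (`SD_CUDA_DEVICE=1 SD_CUDA_DEVICE_CLIP=-1 SD_CUDA_DEVICE_VAE=0`) would put the text encoder on the CPU, the VAE on device 0, and everything without its own variable on device 1.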

mudler · 2026-04-24T15:30:39Z

> @mudler yours looks much better, wonder if that's quants or if my implementation has a bug somewhere.
>
> Edit: might be distilled vs full too though.

I'm using the distilled model:

~/ltxv-sd-cpp/build-cuda/bin/sd-cli -M vid_gen \
    -m ltxv-models/ltx-2.3-22b-distilled.safetensors \
    --text-encoder gemma-3-12b-it \
    -p 'a cat walking across a grassy field' \
    -W 768 -H 512 --video-frames 121 \
    --steps 8 --cfg-scale 1 \
    -o /tmp/ltx23_clean.webp --seed 42

pwilkin · 2026-04-24T15:42:42Z

@mudler yeah I'm doing full for some reason (probably the same one that caused me to pick IQ4_XS :D)

pwilkin · 2026-04-24T20:21:58Z

So apparently there are some major divergences between CPU and CUDA Gemma3, which is a bit surprising (and it happens on both Q4_0 and the IQ4_XS quants).
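
One way to put a number on a divergence like this is to dump the same tensor (for example the final text-encoder hidden states) from a CPU run and a CUDA run, then compare the two buffers. The sketch below assumes both dumps are already loaded as equally sized `float` vectors. How they are dumped is left open, and `DiffStats` and `compare_outputs` are hypothetical names, not part of stable-diffusion.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of how far two backend outputs are apart.
struct DiffStats {
    double max_abs_diff;       // largest element-wise absolute difference
    double cosine_similarity;  // 1.0 means same direction, 0.0 means orthogonal
};

// Compare two activation dumps of identical length (e.g. CPU vs CUDA).
// Accumulates in double to keep the comparison itself from adding noise.
static DiffStats compare_outputs(const std::vector<float>& a,
                                 const std::vector<float>& b) {
    assert(a.size() == b.size());
    double max_abs = 0.0, dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double x = a[i];
        const double y = b[i];
        max_abs = std::max(max_abs, std::fabs(x - y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    const double denom = std::sqrt(norm_a) * std::sqrt(norm_b);
    // If either vector is all zeros the similarity is undefined; report 0.0.
    return {max_abs, denom > 0.0 ? dot / denom : 0.0};
}
```

Small differences are expected between backends because of different accumulation order and kernel implementations, so what counts as "major" is a judgment call. Comparing layer by layer would show where the two runs first drift apart.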

Commit 246e7ee: LTX-2 first version

mudler mentioned this pull request on Apr 24, 2026: feat: add LTX-2 video generation support #1459 (Open)

wbruna reviewed Apr 24, 2026

Commit 26ea8ea: Add backend fitting, some fixes

5 participants
