Optimal setup for low VRAM conditions.

#4
by BigDeeper - opened

It appears that it is possible to run LTX-2 with low VRAM, if one allows offloading to the host's RAM.

Although I have four (older) GPUs with 12.2GiB each and one with 4GiB, ComfyUI uses only one GPU, which is obviously an unwelcome limitation.
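For context, the relevant ComfyUI launcher flags I'm aware of only let you choose *which* single GPU to use, or offload weights to system RAM; they don't split one model across GPUs (a rough sketch, paths assumed to be the stock ComfyUI `main.py`):

```shell
# Pick which GPU ComfyUI binds to (it still uses only that one):
python main.py --cuda-device 1

# Offload model weights to system RAM when VRAM is tight:
python main.py --lowvram   # partial offload
python main.py --novram    # keep weights in system RAM as much as possible
```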

With some previous video models, I was able to crudely distribute the load over multiple GPUs, with a different model per GPU. Again, not optimal.

llama.cpp is able to load models in GGUF format and distribute layers across multiple GPUs plus the CPU as needed.
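For reference, this is the kind of splitting I mean; a hypothetical llama.cpp invocation for my 3×12GiB + 1×4GiB setup (model filename made up for illustration):

```shell
# -ngl (--n-gpu-layers): how many layers to offload to the GPUs;
#   any remaining layers stay on the CPU.
# --tensor-split: relative share of work per GPU,
#   here 3:3:3:1 to roughly match each card's VRAM.
llama-cli -m model.gguf -ngl 40 --tensor-split 3,3,3,1 -p "hello"
```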

Question 1: Are there nodes under ComfyUI that can take an LTX model in GGUF format and do the same as llama.cpp?

Question 2: Could I actually run the LTX GGUF model with llama.cpp, and have ComfyUI interact with llama.cpp's API to do its work?

Any other thoughts on how I can better leverage my hardware to run workflows, and get output within a reasonable time, at decent resolution, and with a duration long enough to convey an idea (12-15 seconds), would be welcome.
