please GGUF
#1
by
houxiaowei
- opened
please GGUF
A GGUF wouldn't help much: this model isn't supported by llama.cpp, so you need to follow the vLLM or SGLang instructions instead.
@zhanghanxiao llama.cpp now has a reference implementation of linear attention, added for Qwen3-Next. Can you add support for your model as well?