please GGUF

#1
by houxiaowei - opened

please GGUF

A GGUF wouldn't help much: this model isn't supported by llama.cpp, so you need to follow the vLLM or SGLang instructions instead.

@zhanghanxiao llama.cpp now has a reference implementation of linear attention through its Qwen3-Next support. Could you add support for your model as well?
