text-embeddings
  Tarka-AIR/Tarka-Embedding-150M-V1 • Feature Extraction • Updated Nov 18, 2025 • 426 downloads • 7 likes
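A minimal usage sketch for the embedding model, assuming the checkpoint is sentence-transformers-compatible (not verified against the model card; `trust_remote_code` may or may not be required). The input sentences are illustrative.

```python
# Sketch: encode sentences with the Tarka embedding model, assuming a
# sentence-transformers-compatible checkpoint (an assumption, not verified).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Tarka-AIR/Tarka-Embedding-150M-V1", trust_remote_code=True)

sentences = ["Prompt caching reuses attention states.", "KV reuse lowers latency."]
embeddings = model.encode(sentences, normalize_embeddings=True)

# With normalized vectors, cosine similarity reduces to a dot product.
print(float(embeddings[0] @ embeddings[1]))
```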
LLMs
  Qwen/Qwen2-VL-2B-Instruct • Image-Text-to-Text • Updated Jan 12, 2025 • 2.71M downloads • 496 likes
  Qwen/QwQ-32B-Preview • Text Generation • 33B params • Updated Jan 12, 2025 • 8.74k downloads • 1.74k likes
  MiniMaxAI/MiniMax-M1-80k • Text Generation • Updated Jul 7, 2025 • 12.1k downloads • 691 likes
  EssentialAI/essential-web-v1.0 • Dataset (Preview) • Updated Oct 2, 2025 • 47.2k downloads • 218 likes
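A minimal text-generation sketch for one of the listed LLMs, using the standard transformers pipeline API. The prompt is illustrative; note the 33B checkpoint needs tens of GB of GPU memory, and `device_map="auto"` (which assumes `accelerate` is installed) shards it across available devices.

```python
# Sketch: chat-style generation with QwQ-32B-Preview via the transformers pipeline.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/QwQ-32B-Preview",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the 33B checkpoint across available devices
)

messages = [{"role": "user", "content": "Explain KV caching in one paragraph."}]
out = generator(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])  # last message is the assistant reply
```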
text-to-image
  Wan-AI/Wan2.2-T2V-A14B • Text-to-Video • Updated Aug 7, 2025 • 18.3k downloads • 432 likes
  QuantStack/Wan2.2-T2V-A14B-GGUF • Text-to-Video • 14B params • Updated Jul 29, 2025 • 105k downloads • 235 likes
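A hedged sketch for the text-to-video model, assuming the Wan-AI repo ships a diffusers-compatible pipeline (a `model_index.json` that `DiffusionPipeline` can auto-route); the GGUF variant targets GGML-based runtimes such as ComfyUI rather than diffusers. The prompt, frame count, and fps are illustrative.

```python
# Sketch: short clip generation with Wan2.2-T2V-A14B, assuming a
# diffusers-loadable pipeline (an assumption, not verified here).
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B", torch_dtype=torch.bfloat16
).to("cuda")

result = pipe(prompt="a red panda climbing a snowy tree", num_frames=33)
export_to_video(result.frames[0], "panda.mp4", fps=16)
```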
VLM
  HuggingFaceM4/Idefics3-8B-Llama3 • Image-Text-to-Text • Updated Dec 2, 2024 • 179k downloads • 302 likes
  HuggingFaceTB/SmolVLM-Instruct • Image-Text-to-Text • 2B params • Updated Apr 8, 2025 • 28.7k downloads • 580 likes
  HuggingFaceTB/SmolLM3-3B • Text Generation • 3B params • Updated Sep 10, 2025 • 1.09M downloads • 916 likes
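A sketch of the usual image-text-to-text flow for SmolVLM-Instruct (the Idefics3 model follows the same processor/chat-template pattern); the local image path is a placeholder.

```python
# Sketch: describe an image with SmolVLM-Instruct via the Vision2Seq interface.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

image = Image.open("photo.jpg")  # placeholder: any local image

# Build the chat prompt; the image placeholder is filled in by the processor.
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to("cuda")

ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(ids, skip_special_tokens=True)[0])
```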
LLMs-optimizations
  Prompt Cache: Modular Attention Reuse for Low-Latency Inference • Paper • arXiv:2311.04934 • Published Nov 7, 2023 • 32 upvotes
  Qwen/Qwen2-VL-2B-Instruct • Image-Text-to-Text • Updated Jan 12, 2025 • 2.71M downloads • 496 likes
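An illustrative sketch of the core idea behind prompt caching: prefill a shared prompt once and reuse its attention KV states across requests. This shows only plain prefix reuse with transformers; the paper's scheme additionally reuses modular, non-prefix prompt segments, which stock transformers caching does not cover. The model choice and prompts are illustrative.

```python
# Sketch: prefix KV-cache reuse (simplified prompt-caching idea, not the
# paper's full modular scheme).
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "HuggingFaceTB/SmolLM3-3B"  # small model from the VLM collection above
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

# 1) Prefill the shared system prompt once and keep its KV cache.
prefix = tok("You are a concise assistant.\n", return_tensors="pt")
with torch.no_grad():
    prefix_cache = model(**prefix, use_cache=True).past_key_values

# 2) Per request, only the new tokens are prefilled; the prefix KV is reused.
for question in ["What is KV caching?", "Why does it cut latency?"]:
    ids = tok(question, return_tensors="pt").input_ids
    full = torch.cat([prefix.input_ids, ids], dim=-1)
    out = model.generate(
        full,
        past_key_values=copy.deepcopy(prefix_cache),  # don't mutate the shared cache
        max_new_tokens=64,
    )
    print(tok.decode(out[0, full.shape[-1]:], skip_special_tokens=True))
```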