Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9 • 745
Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance Apr 16 • 57
Quantization Spaces on the Hub ⚡ Collection A collection of spaces that allow you to quantize on the Hub • 4 items • Updated Nov 28 • 7
Reasoning Router Collection Route between “thinking” and “no-thinking” modes for hybrid models like Qwen3. Blog: https://huggingface.co/blog/AmirMohseni/reasoning-router • 9 items • Updated Nov 16 • 2
Scaling Test-Time Compute with Open Models Collection Models and datasets used in our blog post: https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute • 10 items • Updated Jan 6 • 27
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 241