Qwen3.5 Collection Qwen3.5 is Qwen's new model family including Qwen3.5 Small: 0.8B, 2B, 4B, 9B and Qwen3.5 Medium: 35B-A3B, 27B, 122B-A10B and 397B-A17B. • 25 items • Updated 18 days ago • 125
Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making Paper • 2602.06570 • Published Feb 6 • 61
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 229
view article Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 124
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated 5 days ago • 247
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 264
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 161