OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published 5 days ago • 195
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs 3 days ago • 36
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published 12 days ago • 339
Article TRL v1.0: Post-Training Library Built to Move with the Field 11 days ago • 47
ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models Paper • 2603.19466 • Published 22 days ago • 41
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published Mar 10 • 75
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • Explore synthetic data experiments on a virtual bookshelf • 219
On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published Feb 24 • 102
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 263
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 216
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 194
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 352