OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published 17 days ago • 91
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published 19 days ago • 44
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image Paper • 2511.13648 • Published 20 days ago • 52
First Try Matters: Revisiting the Role of Reflection in Reasoning Models Paper • 2510.08308 • Published Oct 9 • 24
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation Paper • 2510.05094 • Published Oct 6 • 37
4DNeX: Feed-Forward 4D Generative Modeling Made Easy Paper • 2508.13154 • Published Aug 18 • 62
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning Paper • 2505.17022 • Published May 22 • 27
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 120
Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family Paper • 2504.18225 • Published Apr 25 • 13
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20 • 40