Experience Transfer for Multimodal LLM Agents in Minecraft Game Paper • 2604.05533 • Published 7 days ago • 13
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU Paper • 2604.05091 • Published 8 days ago • 43
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence? Paper • 2604.03016 • Published 11 days ago • 36
view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 12 days ago • 840
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate Paper • 2504.19874 • Published Apr 28, 2025 • 32
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook Paper • 2604.02029 • Published 12 days ago • 138
Emergent Social Intelligence Risks in Generative Multi-Agent Systems Paper • 2603.27771 • Published 16 days ago • 51
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 15 days ago • 57
ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks Paper • 2603.27862 • Published 16 days ago • 30
Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design Paper • 2603.28376 • Published 15 days ago • 22
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Paper • 2603.24329 • Published 20 days ago • 28
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published Mar 6 • 48
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published 19 days ago • 50
MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models Paper • 2603.25744 • Published 19 days ago • 13
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 20 days ago • 96
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published 20 days ago • 53
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 19 days ago • 130