IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval Paper • 2503.04644 • Published Mar 6, 2025 • 22
MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism Paper • 2606.07512 • Published 11 days ago • 38
VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding Paper • 2606.05259 • Published 13 days ago • 38
OpenComputer: Verifiable Software Worlds for Computer-Use Agents Paper • 2605.19769 • Published 28 days ago • 82
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Paper • 2501.12380 • Published Jan 21, 2025 • 83