Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models Paper • 2601.20354 • Published 3 days ago • 98
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published 3 days ago • 111
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published 16 days ago • 155
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 22 days ago • 165
Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding Paper • 2509.15178 • Published Sep 18, 2025 • 6
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published Dec 12, 2024 • 21