SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks Paper • 2606.09669 • Published 22 days ago • 46
Efficient Agentic Reinforcement Learning with On-Policy Intrinsic Knowledge Boundary Enhancement Paper • 2605.26952 • Published May 26 • 16
Rethinking Cross-Layer Information Routing in Diffusion Transformers Paper • 2605.20708 • Published May 20 • 111
Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published May 16 • 97
MLLM-CL: Continual Learning for Multimodal Large Language Models Paper • 2506.05453 • Published Jun 5, 2025 • 4
MMEVOKE (ICLR26 🔥) Collection • MMEVOKE introduces the first comprehensive benchmark and systematic evaluation framework designed to investigate multimodal evolving knowledge injecti • 4 items • Updated May 5 • 2
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published Jan 23 • 34
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head Paper • 2601.07832 • Published Jan 12 • 53
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 215
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search Paper • 2601.04767 • Published Jan 8 • 28
kailinjiang/llava_1.5_13b_covariance_matrices_from_onevision_pre_64_seed_rank233_new222 Updated Dec 30, 2025
kailinjiang/llava_1.5_13b_covariance_matrices_from_onevision_pre_64_seed_rank233_new Updated Dec 30, 2025