Demystifying the Slash Pattern in Attention: The Role of RoPE • Paper • 2601.08297 • Published 16 days ago • 3
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation • Paper • 2511.11434 • Published Nov 14, 2025 • 45
Unlocking Out-of-Distribution Generalization in Transformers via Recursive Latent Space Reasoning • Paper • 2510.14095 • Published Oct 15, 2025 • 6
Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation • Paper • 2510.15624 • Published Oct 17, 2025 • 15
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning • Paper • 2509.02479 • Published Sep 2, 2025 • 84
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning • Paper • 2507.22607 • Published Jul 30, 2025 • 47
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs • Paper • 2506.14429 • Published Jun 17, 2025 • 44
Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders • Paper • 2506.14002 • Published Jun 16, 2025 • 5
Understanding R1-Zero-Like Training: A Critical Perspective • Paper • 2503.20783 • Published Mar 26, 2025 • 59
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework • Paper • 2503.10704 • Published Mar 12, 2025 • 5