PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models Paper • 2606.19534 • Published 18 days ago • 64
Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 12 days ago • 144
DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams Paper • 2606.21337 • Published 16 days ago • 74
ReMoT: Reinforcement Learning with Motion Contrast Triplets Paper • 2603.00461 • Published Mar 20 • 1
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 194
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Paper • 2604.19636 • Published Apr 21 • 88