HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published 4 days ago • 45
SpatialActor Collection Models and datasets of SpatialActor (https://github.com/shihao1895/SpatialActor) • 4 items • Updated Jan 9 • 1
MemoryVLA Collection Checkpoints, data and logs of MemoryVLA & MemoryVLA+ (https://github.com/shihao1895/MemoryVLA) • 19 items • Updated 19 days ago • 7
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation Paper • 2507.08441 • Published Jul 11, 2025 • 62