Liu
Liudawp
AI & ML interests
None yet
Organizations
None yet
ai tech
- SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
  Paper • 2502.14786 • Published • 157
- Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
  Paper • 2502.14846 • Published • 14
- RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers
  Paper • 2502.14377 • Published • 12
models
0
None public yet
datasets
0
None public yet