EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model Paper • 2604.10268 • Published 21 days ago • 12
Article • Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents • 3 days ago • 39
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 5 days ago • 113
Article • Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers • 16 days ago • 66
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Paper • 2604.06916 • Published 24 days ago • 34
Distilling 100B+ Models 40x Faster with TRL • TRL distillation for 100B+ teachers, 40x faster • Running • Featured • 75
Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory Paper • 2604.01007 • Published 30 days ago • 31
Article • Welcome Gemma 4: Frontier multimodal intelligence on device • 30 days ago • 884
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published Mar 26 • 131
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation Paper • 2603.23500 • Published Mar 24 • 35
Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF Image-Text-to-Text • 4B • Updated 26 days ago • 59.7k • 119
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published Mar 3 • 146