DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17, 2024 • 55
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations Paper • 2410.10792 • Published Oct 14, 2024 • 31
Memory-Efficient LLM Training with Online Subspace Descent Paper • 2408.12857 • Published Aug 23, 2024 • 15
On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating Paper • 2505.10860 • Published May 16, 2025 • 1
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild Paper • 2403.16973 • Published Mar 25, 2024 • 3
DataComp: In search of the next generation of multimodal datasets Paper • 2304.14108 • Published Apr 27, 2023 • 2
AMO Sampler: Enhancing Text Rendering with Overshooting Paper • 2411.19415 • Published Nov 28, 2024 • 5
Image and Video Tokenization with Binary Spherical Quantization Paper • 2406.07548 • Published Jun 11, 2024 • 1
Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation Paper • 2506.17213 • Published Jun 20, 2025 • 4
BAT: Learning to Reason about Spatial Sounds with Large Language Models Paper • 2402.01591 • Published Feb 2, 2024 • 1