Article How to make NeuTTS-air generate over 200 seconds of audio in a single second. 16 days ago • 12
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models Paper • 2510.17519 • Published Oct 20 • 9
pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation Paper • 2510.14974 • Published Oct 16 • 9
Stable Video Infinity: Infinite-Length Video Generation with Error Recycling Paper • 2510.09212 • Published Oct 10 • 16
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7 • 53
Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation Paper • 2510.01284 • Published Sep 30 • 33
MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech Paper • 2509.25131 • Published Sep 29 • 15
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer Paper • 2509.24695 • Published Sep 29 • 45
SVDQuant Collection Models and datasets for "SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models" • 20 items • Updated May 29 • 64
LPD Collection Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation • 6 items • Updated Jul 2 • 2
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs Paper • 2509.08358 • Published Sep 10 • 13
Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling Paper • 2509.01624 • Published Sep 1 • 7
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference Paper • 2509.06942 • Published Sep 8 • 17