Literate Goggles

literate-goggles

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

upvoted an article 4 days ago

Continuous batching from first principles

upvoted an article 4 days ago

Diffusers welcomes FLUX-2

Organizations

None yet

upvoted a paper 3 days ago

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published 4 days ago • 141

upvoted 3 articles 4 days ago

Continuous batching from first principles

Article • 13 days ago • 250

Diffusers welcomes FLUX-2

Article • 13 days ago • 157

SARLO-80: Worldwide Slant SAR Language Optic Dataset at 80 cm Resolution

Article • 7 days ago

upvoted a paper 21 days ago

Step-Audio-EditX Technical Report

Paper • 2511.03601 • Published Nov 5 • 28

upvoted 3 papers 26 days ago

upvoted 3 papers 28 days ago

Prompt-to-Prompt Image Editing with Cross Attention Control

Paper • 2208.01626 • Published Aug 2, 2022 • 3

Generating Creative Chess Puzzles

Paper • 2510.23881 • Published Oct 27 • 1

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Paper • 2407.05282 • Published Jul 7, 2024 • 16

upvoted 4 papers about 1 month ago

FARMER: Flow AutoRegressive Transformer over Pixels

Paper • 2510.23588 • Published Oct 27 • 57

Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Paper • 2106.06103 • Published Jun 11, 2021 • 4

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17 • 89

Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published Oct 23 • 45

upvoted an article about 1 month ago

Building the Open Agent Ecosystem Together: Introducing OpenEnv

Article • Oct 23 • 135

upvoted a paper about 2 months ago

UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE

Paper • 2510.13344 • Published Oct 15 • 62

upvoted 3 papers 2 months ago

LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec

Paper • 2410.15764 • Published Oct 21, 2024 • 1

MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer

Paper • 2409.00750 • Published Sep 1, 2024 • 5

RLP: Reinforcement as a Pretraining Objective

Paper • 2510.01265 • Published Sep 26 • 40
